IMMUNISATION AGAINST CHLAMYDIA PNEUMONIAE
All documents cited herein are incorporated by reference in their entirety.
TECHNICAL FIELD 

This invention is in the field of immunisation against chlamydial infection, in particular against
infection by Chlamydia pneumoniae.

BACKGROUND ART 

Chlamydiae are obligate intracellular parasites of eukaryotic cells which are responsible for endemic
sexually transmitted infections and various other disease syndromes. They occupy an exclusive
eubacterial phylogenic branch, having no close relationship to any other known organisms - they are
classified in their own order (Chlamydiales) which contains a single family (Chlamydiaceae) which
in turn contains a single genus (Chlamydia). A particular characteristic of the Chlamydiae is their
unique life cycle, in which the bacterium alternates between two morphologically distinct forms: an
extracellular infective form (elementary bodies, EB) and an intracellular non-infective form
(reticulate bodies, RB). The life cycle is completed with the re-organization of RB into EB, which
subsequently leave the disrupted host cell ready to infect further cells.

Four chlamydial species are currently known - C.trachomatis, C.pneumoniae, C.pecorum and
C.psittaci [e.g. Raulston (1995) Mol Microbiol 15:607-616; Everett (2000) Vet Microbiol 75:109-126].
C.pneumoniae is closely related to C.trachomatis, as the whole genome comparison of at least
two isolates from each species has shown [Kalman et al. (1999) Nature Genetics 21:385-389; Read
et al. (2000) Nucleic Acids Res 28:1397-406; Stephens et al. (1998) Science 282:754-759]. Based on
surface reaction with patient immune sera, the current view is that only one serotype of
C.pneumoniae exists world-wide.

C.pneumoniae is a common cause of human respiratory disease. It was first isolated from the
conjunctiva of a child in Taiwan in 1965, and was established as a major respiratory pathogen in
1983. In the USA, C.pneumoniae causes approximately 10% of community-acquired pneumonia and
5% of pharyngitis, bronchitis, and sinusitis. 

More recently, the spectrum of C.pneumoniae infections has been extended to include
atherosclerosis, coronary heart disease, carotid artery stenosis, myocardial infarction, cerebrovascular
disease, aortic aneurysm, claudication, and stroke. The association of C.pneumoniae with
atherosclerosis is corroborated by the presence of the organism in atherosclerotic lesions throughout
the arterial tree and the near absence of the organism in healthy arterial tissue. C.pneumoniae has
also been isolated from coronary and carotid atheromatous plaques. The bacterium has also been
associated with other acute and chronic respiratory diseases (e.g. otitis media, chronic obstructive
pulmonary disease, pulmonary exacerbation of cystic fibrosis) as a result of sero-epidemiologic
observations, case reports, isolation or direct detection of the organism in specimens, and successful
response to anti-chlamydial antibiotics. To determine whether chronic infection plays a role in
initiation or progression of disease, intervention studies in humans have been initiated, and animal
models of C.pneumoniae infection have been developed.

Considerable knowledge of the epidemiology of C.pneumoniae infection has been derived from
serologic studies using the C.pneumoniae-specific microimmunofluorescence test. Infection is
ubiquitous, and it is estimated that virtually everyone is infected at some point in life, with common
re-infection. Antibodies against C.pneumoniae are rare in children under the age of 5, except in
developing and tropical countries. Antibody prevalence increases rapidly at ages 5 to 14, reaching
50% at the age of 20, and continuing to increase slowly to ~80% by age 70.

A current hypothesis is that C.pneumoniae can persist in an asymptomatic low-grade infection in
very large sections of the human population. When this condition occurs, it is believed that the
presence of C.pneumoniae, and/or the effects of the host reaction to the bacterium, can cause or
promote the progression of cardiovascular illness.

It is not yet clear whether C.pneumoniae is actually a causative agent of cardiovascular disease, or
whether it is just artefactually associated with it. It has been shown, however, that C.pneumoniae
infection can induce LDL oxidation by human monocytes [Kalayoglu et al. (1999) J. Infect. Dis.
180:780-90; Kalayoglu et al. (1999) Am Heart J. 138:S488-490]. As LDL oxidation products are
highly atherogenic, this observation provides a possible mechanism whereby C.pneumoniae may
cause atheromatous degeneration. If a causative effect is confirmed, vaccination (prophylactic and
therapeutic) will be universally recommended.

Genomic sequence information has been published for C.pneumoniae [Kalman et al. (1999) supra;
Read et al. (2000) supra; Shirai et al. (2000) J. Infect. Dis. 181(Suppl 3):S524-S527; WO99/27105; 
WO00/27994] and is available from GenBank. Sequencing efforts have not, however, focused on 
vaccination, and the availability of genomic sequence does not in itself indicate which of the >1000 
genes might encode useful antigens for immunisation and vaccination. WO99/27105, for instance,
implies that every one of the 1296 ORFs identified in the C.pneumoniae strain CM1 genome is a
useful vaccine antigen. 

It is thus an object of the present invention to identify antigens useful for vaccine production and 
development from amongst the many proteins present in C.pneumoniae. It is a further object to
identify antigens useful for diagnosis (e.g. immunodiagnosis) of C.pneumoniae.

DISCLOSURE OF THE INVENTION 

The invention provides proteins comprising the C.pneumoniae amino acid sequences disclosed in the
examples. 

It also provides proteins comprising sequences which share at least x% sequence identity with the 
C.pneumoniae amino acid sequences disclosed in the examples. Depending on the particular
sequence, x is preferably 50% or more (e.g. 60%, 70%, 80%, 90%, 95%, 99% or more). These
include mutants and allelic variants. Typically, 50% identity or more between two proteins is
considered to be an indication of functional equivalence. Identity between proteins is preferably
determined by the Smith-Waterman homology search algorithm as implemented in the MPSRCH
program (Oxford Molecular), using an affine gap search with parameters gap open penalty=12 and
gap extension penalty=1.
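
The identity calculation described above can be illustrated with a minimal Python sketch of a Smith-Waterman local alignment with affine gaps. This is not the MPSRCH implementation: the flat match/mismatch scores stand in for a substitution matrix, the function name is hypothetical, and the convention that a gap of length k costs gap_open + (k - 1) * gap_extend is an assumption. Only the penalty values 12 and 1 are taken from the text.

```python
def smith_waterman_identity(a, b, match=5, mismatch=-4, gap_open=12, gap_extend=1):
    """Best local alignment of a and b; returns (score, percent identity).

    Assumptions (not from the specification): flat match/mismatch scores
    replace a substitution matrix, and a gap of length k costs
    gap_open + (k - 1) * gap_extend.
    """
    n, m = len(a), len(b)
    neg = float("-inf")
    H = [[0] * (m + 1) for _ in range(n + 1)]    # best local score ending at (i, j)
    E = [[neg] * (m + 1) for _ in range(n + 1)]  # ... ending with b[j-1] opposite a gap
    F = [[neg] * (m + 1) for _ in range(n + 1)]  # ... ending with a[i-1] opposite a gap
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(H[i][j - 1] - gap_open, E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - gap_open, F[i - 1][j] - gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s, E[i][j], F[i][j])
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell, counting aligned columns and identities.
    i, j = best_pos
    state, identical, columns = "H", 0, 0
    while i > 0 and j > 0:
        if state == "H":
            if H[i][j] == 0:
                break  # start of the local alignment
            s = match if a[i - 1] == b[j - 1] else mismatch
            if H[i][j] == H[i - 1][j - 1] + s:
                identical += a[i - 1] == b[j - 1]
                columns += 1
                i, j = i - 1, j - 1
            elif H[i][j] == E[i][j]:
                state = "E"
            else:
                state = "F"
        elif state == "E":
            columns += 1
            if E[i][j] == H[i][j - 1] - gap_open:
                state = "H"  # this column opened the gap
            j -= 1
        else:
            columns += 1
            if F[i][j] == H[i - 1][j] - gap_open:
                state = "H"
            i -= 1
    identity = 100.0 * identical / columns if columns else 0.0
    return best, identity
```

In this sketch identity is reported over the columns of the best local alignment, gap columns included; other definitions of the denominator are in use, so a value computed here need not match the figure another program reports.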

The invention further provides proteins comprising fragments of the C.pneumoniae amino acid
sequences disclosed in the examples. The fragments should comprise at least n consecutive amino
acids from the sequences and, depending on the particular sequence, n is 7 or more (e.g. 8, 10, 12,
14, 16, 18, 20, 30, 40, 50, 75, 100 or more). Preferably the fragments comprise one or more
epitope(s) from the sequence. Other preferred fragments omit a signal peptide. 
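
The "at least n consecutive amino acids" criterion can be checked mechanically. The following Python sketch is offered only as an illustration: the function name and the example sequences are hypothetical and are not sequences from the examples.

```python
def shares_fragment(candidate, reference, n=7):
    """Return True if candidate contains at least n consecutive residues of reference."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Every window of length n in the reference sequence.
    windows = {reference[i:i + n] for i in range(len(reference) - n + 1)}
    # A shared run of n or more residues always contains a shared window of exactly n.
    return any(candidate[i:i + n] in windows
               for i in range(len(candidate) - n + 1))


# Hypothetical sequences, for illustration only.
reference = "MKKLLALSLVLAFSSATAAF"
candidate = "GGGLLALSLVGGG"
print(shares_fragment(candidate, reference, n=7))  # the shared run is LLALSLV
print(shares_fragment(candidate, reference, n=8))
```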

The proteins of the invention can, of course, be prepared by various means (e.g. native expression, 
recombinant expression, purification from cell culture, chemical synthesis etc.) and in various forms 
(e.g. native, fusions etc.). They are preferably prepared in substantially pure form (i.e. substantially
free from other C.pneumoniae or host cell proteins). Heterologous expression in E.coli is a preferred
preparative route. 

According to a further aspect, the invention provides nucleic acid comprising the C.pneumoniae
nucleotide sequences disclosed in the examples. In addition, the invention provides nucleic acid
comprising sequences which share at least x% sequence identity with the C.pneumoniae nucleotide
sequences disclosed in the examples. Depending on the particular sequence, x is preferably 50% or
more (e.g. 60%, 70%, 80%, 90%, 95%, 99% or more). 

Furthermore, the invention provides nucleic acid which can hybridise to the C.pneumoniae nucleic
acid disclosed in the examples, preferably under "high stringency" conditions (e.g. 65°C in a
0.1xSSC, 0.5% SDS solution).

Nucleic acid comprising fragments of these sequences is also provided. These should comprise at
least n consecutive nucleotides from the C.pneumoniae sequences and, depending on the particular
sequence, n is 10 or more (e.g. 12, 14, 15, 18, 20, 25, 30, 35, 40, 50, 75, 100, 200, 300 or more). 

According to a further aspect, the invention provides nucleic acid encoding the proteins and protein 
fragments of the invention. 

It should also be appreciated that the invention provides nucleic acid comprising sequences
complementary to those described above (e.g. for antisense or probing purposes). 
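
By way of illustration only, the complementary strand of a DNA sequence (as might be used for antisense or probing purposes) can be derived as in the following Python sketch. The function name is hypothetical, only the four standard DNA bases are handled, and any other character is passed through unchanged.

```python
# Translation table for the four standard DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGGCC"))  # GGCCAT
```

Applying the function twice returns the original sequence, which is a convenient sanity check.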

Nucleic acid according to the invention can, of course, be prepared in many ways (e.g. by chemical 
synthesis, from genomic or cDNA libraries, from the organism itself etc.) and can take various forms 
(e.g. single stranded, double stranded, vectors, probes etc.).

In addition, the term "nucleic acid" includes DNA and RNA, and also their analogues, such as those
containing modified backbones, and also peptide nucleic acids (PNA) etc.

According to a further aspect, the invention provides vectors comprising nucleotide sequences of the
invention (e.g. cloning or expression vectors) and host cells transformed therewith.

According to a further aspect, the invention provides immunogenic compositions comprising protein
and/or nucleic acid according to the invention. These compositions are suitable for immunisation and 
vaccination purposes. Vaccines of the invention may be prophylactic or therapeutic, and will 
typically comprise an antigen which can induce antibodies capable of inhibiting (a) chlamydial 
adhesion, (b) chlamydial entry, and/or (c) successful replication within the host cell. The vaccines 
preferably induce any cell-mediated T-cell responses which are necessary for chlamydial clearance 
from the host. 

The invention also provides nucleic acid or protein according to the invention for use as 
medicaments (e.g. as vaccines). It also provides the use of nucleic acid or protein according to the
invention in the manufacture of a medicament (e.g. a vaccine or an immunogenic composition) for
treating or preventing infection due to C.pneumoniae.

The invention also provides a method of treating (e.g. immunising) a patient, comprising
administering to the patient a therapeutically effective amount of nucleic acid or protein according to 
the invention. 

According to further aspects, the invention provides various processes. 

A process for producing proteins of the invention is provided, comprising the step of culturing a host
cell according to the invention under conditions which induce protein expression. 

A process for producing protein or nucleic acid of the invention is provided, wherein the protein or 
nucleic acid is synthesised in part or in whole using chemical means. 

A process for detecting C.pneumoniae in a sample is provided, wherein the sample is contacted with
an antibody which binds to a protein of the invention.

A summary of standard techniques and procedures which may be employed in order to perform the 
invention (e.g. to utilise the disclosed sequences for immunisation) follows. This summary is not a
limitation on the invention but, rather, gives examples that may be used, but are not required. 

General 

The practice of the present invention will employ, unless otherwise indicated, conventional techniques of
molecular biology, microbiology, recombinant DNA, and immunology, which are within the skill of the art.
Such techniques are explained fully in the literature e.g. Sambrook Molecular Cloning: A Laboratory Manual,
Second Edition (1989) and Third Edition (2001); DNA Cloning, Volumes I and II (D.N. Glover ed. 1985);
Oligonucleotide Synthesis (M.J. Gait ed. 1984); Nucleic Acid Hybridization (B.D. Hames & S.J. Higgins eds.
1984); Transcription and Translation (B.D. Hames & S.J. Higgins eds. 1984); Animal Cell Culture (R.I.
Freshney ed. 1986); Immobilized Cells and Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide to
Molecular Cloning (1984); the Methods in Enzymology series (Academic Press, Inc.), especially volumes 154 &
155; Gene Transfer Vectors for Mammalian Cells (J.H. Miller and M.P. Calos eds. 1987, Cold Spring Harbor
Laboratory); Mayer and Walker, eds. (1987), Immunochemical Methods in Cell and Molecular Biology
(Academic Press, London); Scopes, (1987) Protein Purification: Principles and Practice, Second Edition
(Springer-Verlag, N.Y.), and Handbook of Experimental Immunology, Volumes I-IV (D.M. Weir and C.C.
Blackwell eds. 1986).

Standard abbreviations for nucleotides and amino acids are used in this specification.

Definitions

A composition containing X is "substantially free of" Y when at least 85% by weight of the total X+Y in the
composition is X. Preferably, X comprises at least about 90% by weight of the total of X+Y in the composition,
more preferably at least about 95% or even 99% by weight. 
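
The weight criterion in this definition amounts to a simple ratio. As a hedged illustration (the function names and the example masses are hypothetical), it may be expressed in Python as:

```python
def fraction_x(mass_x, mass_y):
    """Weight fraction of X in the total X+Y (both masses in the same unit)."""
    total = mass_x + mass_y
    if total <= 0:
        raise ValueError("total mass of X+Y must be positive")
    return mass_x / total


def substantially_free_of_y(mass_x, mass_y, threshold=0.85):
    """True when X makes up at least `threshold` of X+Y by weight.

    The default 0.85 is the minimum stated in the definition; 0.90, 0.95 or
    0.99 correspond to the preferred levels.
    """
    return fraction_x(mass_x, mass_y) >= threshold


# Hypothetical preparation: 9.0 mg of X with 1.0 mg of Y, i.e. 90% X by weight.
print(substantially_free_of_y(9.0, 1.0))                  # True
print(substantially_free_of_y(9.0, 1.0, threshold=0.95))  # False
```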

The term "comprising" means "including" as well as "consisting" e.g. a composition "comprising" X may 
consist exclusively of X or may include something additional to X, such as X+Y. 

The term "heterologous" refers to two biological components that are not found together in nature. The 
components may be host cells, genes, or regulatory regions, such as promoters. Although the heterologous 
components are not found together in nature, they can function together, as when a promoter heterologous to a 
gene is operably linked to the gene. Another example is where a Chlamydial sequence is heterologous to a
mouse host cell. A further example would be two epitopes from the same or different proteins which have been
assembled in a single protein in an arrangement not found in nature.

An "origin of replication" is a polynucleotide sequence that initiates and regulates replication of polynucleotides 
such as an expression vector. The origin of replication behaves as an autonomous unit of polynucleotide 
replication within a cell, capable of replication under its own control. An origin of replication may be needed for 
a vector to replicate in a particular host cell. With certain origins of replication, an expression vector can be 
reproduced at a high copy number in the presence of the appropriate proteins within the cell. Examples of 
origins are the autonomously replicating sequences, which are effective in yeast; and the viral T-antigen,
effective in COS-7 cells.

A "mutant" sequence is defined as DNA, RNA or amino acid sequence differing from but having sequence 
identity with the native or disclosed sequence. Depending on the particular sequence, the degree of sequence
identity between the native or disclosed sequence and the mutant sequence is preferably greater than 50% (e.g.
60%, 70%, 80%, 90%, 95%, 99% or more, calculated using the Smith-Waterman algorithm as described above).

As used herein, an "allelic variant" of a nucleic acid molecule, or region, for which nucleic acid sequence is
provided herein is a nucleic acid molecule, or region, that occurs essentially at the same locus in the genome of 
another or second isolate, and that, due to natural variation caused by, for example, mutation or recombination 
has a similar but not identical nucleic acid sequence. A coding region allelic variant typically encodes a protein 
having similar activity to that of the protein encoded by the gene to which it is being compared. An allelic 
variant can also comprise an alteration in the 5' or 3' untranslated regions of the gene, such as in regulatory 
control regions (e.g. see US patent 5,753,235).



Expression systems

The Chlamydial nucleotide sequences can be expressed in a variety of different expression systems; for example 
those used with mammalian cells, baculoviruses, plants, bacteria, and yeast. 

i. Mammalian Systems 

Mammalian expression systems are known in the art. A mammalian promoter is any DNA sequence capable of 
binding mammalian RNA polymerase and initiating the downstream (3') transcription of a coding sequence (e.g. 
structural gene) into mRNA. A promoter will have a transcription initiating region, which is usually placed 
proximal to the 5' end of the coding sequence, and a TATA box, usually located 25-30 base pairs (bp) upstream
of the transcription initiation site. The TATA box is thought to direct RNA polymerase II to begin RNA 
synthesis at the correct site. A mammalian promoter will also contain an upstream promoter element, usually 
located within 100 to 200 bp upstream of the TATA box. An upstream promoter element determines the rate at 
which transcription is initiated and can act in either orientation [Sambrook et al. (1989) "Expression of Cloned 
Genes in Mammalian Cells." In Molecular Cloning: A Laboratory Manual, 2nd ed.].

Mammalian viral genes are often highly expressed and have a broad host range; therefore sequences encoding 
mammalian viral genes provide particularly useful promoter sequences. Examples include the SV40 early 
promoter, mouse mammary tumor virus LTR promoter, adenovirus major late promoter (Ad MLP), and herpes 
simplex virus promoter. In addition, sequences derived from non-viral genes, such as the murine
metallothionein gene, also provide useful promoter sequences. Expression may be either constitutive or
regulated (inducible); depending on the promoter, expression can be induced with glucocorticoid in
hormone-responsive cells.

The presence of an enhancer element (enhancer), combined with the promoter elements described above, will 
usually increase expression levels. An enhancer is a regulatory DNA sequence that can stimulate transcription up 
to 1000-fold when linked to homologous or heterologous promoters, with synthesis beginning at the normal 
RNA start site. Enhancers are also active when they are placed upstream or downstream from the transcription
initiation site, in either normal or flipped orientation, or at a distance of more than 1000 nucleotides from the
promoter [Maniatis et al. (1987) Science 236:1237; Alberts et al. (1989) Molecular Biology of the Cell, 2nd ed.].
Enhancer elements derived from viruses may be particularly useful, because they usually have a broader host
range. Examples include the SV40 early gene enhancer [Dijkema et al. (1985) EMBO J. 4:761] and the
enhancer/promoters derived from the long terminal repeat (LTR) of the Rous Sarcoma Virus [Gorman et al.
(1982) PNAS USA 79:6777] and from human cytomegalovirus [Boshart et al. (1985) Cell 41:521]. Additionally,
some enhancers are regulable and become active only in the presence of an inducer, such as a hormone or
metal ion [Sassone-Corsi and Borelli (1986) Trends Genet. 2:215; Maniatis et al. (1987) Science 236:1237].

A DNA molecule may be expressed intracellularly in mammalian cells. A promoter sequence may be directly
linked with the DNA molecule, in which case the first amino acid at the N-terminus of the recombinant protein
will always be a methionine, which is encoded by the ATG start codon. If desired, the N-terminus may be
cleaved from the protein by in vitro incubation with cyanogen bromide.

Alternatively, foreign proteins can also be secreted from the cell into the growth media by creating chimeric 
DNA molecules that encode a fusion protein comprised of a leader sequence fragment that provides for secretion 
of the foreign protein in mammalian cells. Preferably, there are processing sites encoded between the leader
fragment and the foreign gene that can be cleaved either in vivo or in vitro. The leader sequence fragment
usually encodes a signal peptide comprised of hydrophobic amino acids which direct the secretion of the protein
from the cell. The adenovirus tripartite leader is an example of a leader sequence that provides for secretion of a
foreign protein in mammalian cells. 

Usually, transcription termination and polyadenylation sequences recognized by mammalian cells are regulatory 
regions located 3' to the translation stop codon and thus, together with the promoter elements, flank the coding 
sequence. The 3' terminus of the mature mRNA is formed by site-specific post-transcriptional cleavage and 
polyadenylation [Birnstiel et al. (1985) Cell 41:349; Proudfoot and Whitelaw (1988) "Termination and 3' end
processing of eukaryotic RNA." In Transcription and splicing (ed. B.D. Hames and D.M. Glover); Proudfoot
(1989) Trends Biochem. Sci. 14:105]. These sequences direct the transcription of an mRNA which can be
translated into the polypeptide encoded by the DNA. Examples of transcription terminator/polyadenylation
signals include those derived from SV40 [Sambrook et al. (1989) "Expression of cloned genes in cultured
mammalian cells." In Molecular Cloning: A Laboratory Manual].

Usually, the above described components, comprising a promoter, polyadenylation signal, and transcription 
termination sequence are put together into expression constructs. Enhancers, introns with functional splice donor 
and acceptor sites, and leader sequences may also be included in an expression construct, if desired. Expression
constructs are often maintained in a replicon, such as an extrachromosomal element (e.g. plasmids) capable of 
stable maintenance in a host, such as mammalian cells or bacteria. Mammalian replication systems include those 
derived from animal viruses, which require trans-acting factors to replicate. For example, plasmids containing 
the replication systems of papovaviruses, such as SV40 [Gluzman (1981) Cell 23:115] or polyomavirus, 
replicate to extremely high copy number in the presence of the appropriate viral T antigen. Additional examples
of mammalian replicons include those derived from bovine papillomavirus and Epstein-Barr virus. Additionally,
the replicon may have two replication systems, thus allowing it to be maintained, for example, in mammalian
cells for expression and in a prokaryotic host for cloning and amplification. Examples of such
mammalian-bacteria shuttle vectors include pMT2 [Kaufman et al. (1989) Mol. Cell. Biol. 9:946] and pHEBO
[Shimizu et al. (1986) Mol. Cell. Biol. 6:1014].

The transformation procedure used depends upon the host to be transformed. Methods for introduction of 
heterologous polynucleotides into mammalian cells are known in the art and include dextran-mediated 
transfection, calcium phosphate precipitation, polybrene-mediated transfection, protoplast fusion, 
electroporation, encapsulation of polynucleotide(s) in liposomes, and direct microinjection of the DNA into nuclei.
Mammalian cell lines available as hosts for expression are known in the art and include many immortalized cell 
lines available from the American Type Culture Collection (ATCC), including but not limited to, Chinese 
hamster ovary (CHO) cells, HeLa cells, baby hamster kidney (BHK) cells, monkey kidney cells (COS), human 
hepatocellular carcinoma cells (e.g. Hep G2), and a number of other cell lines. 

ii. Baculovirus Systems

The polynucleotide encoding the protein can also be inserted into a suitable insect expression vector, and is
operably linked to the control elements within that vector. Vector construction employs techniques which are
known in the art. Generally, the components of the expression system include a transfer vector, usually a
bacterial plasmid, which contains both a fragment of the baculovirus genome, and a convenient restriction site
for insertion of the heterologous gene or genes to be expressed; a wild type baculovirus with a sequence
homologous to the baculovirus-specific fragment in the transfer vector (this allows for the homologous
recombination of the heterologous gene into the baculovirus genome); and appropriate insect host cells and
growth media.

After inserting the DNA sequence encoding the protein into the transfer vector, the vector and the wild type viral 
genome are transfected into an insect host cell where the vector and viral genome are allowed to recombine. The
packaged recombinant virus is expressed and recombinant plaques are identified and purified. Materials and 
methods for baculovirus/insect cell expression systems are commercially available in kit form from, inter alia, 
Invitrogen, San Diego CA ("MaxBac" kit). These techniques are generally known to those skilled in the art and 
fully described in Summers and Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987) 
(hereinafter "Summers and Smith").

Prior to inserting the DNA sequence encoding the protein into the baculovirus genome, the above described 
components, comprising a promoter, leader (if desired), coding sequence of interest, and transcription 
termination sequence, are usually assembled into an intermediate transplacement construct (transfer vector). This 
construct may contain a single gene and operably linked regulatory elements; multiple genes, each with its
own set of operably linked regulatory elements; or multiple genes, regulated by the same set of regulatory
elements. Intermediate transplacement constructs are often maintained in a replicon, such as an 
extrachromosomal element (e.g. plasmids) capable of stable maintenance in a host, such as a bacterium. The 
replicon will have a replication system, thus allowing it to be maintained in a suitable host for cloning and 
amplification. 

Currently, the most commonly used transfer vector for introducing foreign genes into AcNPV is pAc373. Many
other vectors, known to those of skill in the art, have also been designed. These include, for example, pVL985
(which alters the polyhedrin start codon from ATG to ATT, and which introduces a BamHI cloning site 32
basepairs downstream from the ATT; see Luckow and Summers, Virology (1989) 77:31).

The plasmid usually also contains the polyhedrin polyadenylation signal (Miller et al. (1988) Ann. Rev. 
Microbiol. 42:111) and a prokaryotic ampicillin-resistance (amp) gene and origin of replication for selection
and propagation in E. coli. 

Baculovirus transfer vectors usually contain a baculovirus promoter. A baculovirus promoter is any DNA 
sequence capable of binding a baculovirus RNA polymerase and initiating the downstream (5' to 3') transcription 
of a coding sequence (e.g. structural gene) into mRNA. A promoter will have a transcription initiation region 
which is usually placed proximal to the 5' end of the coding sequence. This transcription initiation region usually
includes an RNA polymerase binding site and a transcription initiation site. A baculovirus transfer vector may 
also have a second domain called an enhancer, which, if present, is usually distal to the structural gene. 
Expression may be either regulated or constitutive. 

Structural genes, abundantly transcribed at late times in a viral infection cycle, provide particularly useful 
promoter sequences. Examples include sequences derived from the gene encoding the viral polyhedron protein,
Friesen et al. (1986) "The Regulation of Baculovirus Gene Expression," in: The Molecular Biology of
Baculoviruses (ed. Walter Doerfler); EPO Publ. Nos. 127 839 and 155 476; and the gene encoding the p10
protein, Vlak et al. (1988), J. Gen. Virol. 69:165.

DNA encoding suitable signal sequences can be derived from genes for secreted insect or baculovirus proteins, 
such as the baculovirus polyhedrin gene (Carbonell et al. (1988) Gene, 73:409). Alternatively, since the signals
for mammalian cell posttranslational modifications (such as signal peptide cleavage, proteolytic cleavage, and
phosphorylation) appear to be recognized by insect cells, and the signals required for secretion and nuclear
accumulation also appear to be conserved between the invertebrate cells and vertebrate cells, leaders of
non-insect origin, such as those derived from genes encoding human α-interferon, Maeda et al., (1985), Nature
315:592; human gastrin-releasing peptide, Lebacq-Verheyden et al., (1988), Molec. Cell. Biol. 8:3129; human
IL-2, Smith et al., (1985) Proc. Nat'l Acad. Sci. USA, 82:8404; mouse IL-3, Miyajima et al., (1987) Gene
58:273; and human glucocerebrosidase, Martin et al. (1988) DNA, 7:99, can also be used to provide for secretion
in insects.

A recombinant polypeptide or polyprotein may be expressed intracellularly or, if it is expressed with the proper
regulatory sequences, it can be secreted. Good intracellular expression of nonfused foreign proteins usually 
requires heterologous genes that ideally have a short leader sequence containing suitable translation initiation 
signals preceding an ATG start signal. If desired, methionine at the N-terminus may be cleaved from the mature 
protein by in vitro incubation with cyanogen bromide. 

Alternatively, recombinant polyproteins or proteins which are not naturally secreted can be secreted from the 
insect cell by creating chimeric DNA molecules that encode a fusion protein comprised of a leader sequence 
fragment that provides for secretion of the foreign protein in insects. The leader sequence fragment usually 
encodes a signal peptide comprised of hydrophobic amino acids which direct the translocation of the protein into 
the endoplasmic reticulum. 

After insertion of the DNA sequence and/or the gene encoding the expression product precursor of the protein, 
an insect cell host is co-transformed with the heterologous DNA of the transfer vector and the genomic DNA of 
wild type baculovirus - usually by co-transfection. The promoter and transcription termination sequence of the 
construct will usually comprise a 2-5kb section of the baculovirus genome. Methods for introducing 
heterologous DNA into the desired site in the baculovirus are known in the art. (See Summers and Smith
supra; Ju et al. (1987); Smith et al., Mol. Cell. Biol. (1983) 3:2156; and Luckow and Summers (1989)). For
example, the insertion can be into a gene such as the polyhedrin gene, by homologous double crossover
recombination; insertion can also be into a restriction enzyme site engineered into the desired baculovirus gene
(Miller et al., (1989), Bioessays 4:91). The DNA sequence, when cloned in place of the polyhedrin gene in the
expression vector, is flanked both 5' and 3' by polyhedrin-specific sequences and is positioned downstream of 
the polyhedrin promoter. 

The newly formed baculovirus expression vector is subsequently packaged into an infectious recombinant
baculovirus. Homologous recombination occurs at low frequency (between ~1% and ~5%); thus, the majority of
the virus produced after cotransfection is still wild-type virus. Therefore, a method is necessary to identify
recombinant viruses. An advantage of the expression system is a visual screen allowing recombinant viruses to
be distinguished. The polyhedrin protein, which is produced by the native virus, is produced at very high levels
in the nuclei of infected cells at late times after viral infection. Accumulated polyhedrin protein forms occlusion
bodies that also contain embedded particles. These occlusion bodies, up to 15 µm in size, are highly refractile,
giving them a bright shiny appearance that is readily visualized under the light microscope. Cells infected with
recombinant viruses lack occlusion bodies. To distinguish recombinant virus from wild-type virus, the
transfection supernatant is plaqued onto a monolayer of insect cells by techniques known to those skilled in the
art. Namely, the plaques are screened under the light microscope for the presence (indicative of wild-type virus)
or absence (indicative of recombinant virus) of occlusion bodies ["Current Protocols in Microbiology" Vol. 2
(Ausubel et al. eds) at 16.8 (Supp. 10, 1990); Summers & Smith, supra; Miller et al. (1989)].
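
As a rough illustration of why a screen is needed, the following sketch estimates how many plaques must be examined to find at least one recombinant, given the recombination frequency of roughly 1% to 5% mentioned above. It assumes each plaque is independently recombinant with a fixed probability, which is a simplification; the function name and the 95% default confidence are illustrative choices and do not come from the text.

```python
import math


def plaques_to_screen(recombinant_fraction: float, confidence: float = 0.95) -> int:
    """Smallest plaque count n with P(at least one recombinant) >= confidence.

    Assumes each plaque is independently recombinant with probability
    `recombinant_fraction` (a simplifying assumption), so that
    P(no recombinant among n plaques) = (1 - recombinant_fraction) ** n.
    """
    if not 0.0 < recombinant_fraction < 1.0:
        raise ValueError("recombinant_fraction must lie strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    # Solve (1 - f) ** n <= 1 - confidence for n and round up.
    n = math.log(1.0 - confidence) / math.log(1.0 - recombinant_fraction)
    return math.ceil(n)


if __name__ == "__main__":
    for fraction in (0.01, 0.05):
        print(fraction, plaques_to_screen(fraction))
```

Under these assumptions, a lower recombination frequency sharply increases the number of plaques to inspect, which is why a fast visual marker such as the absence of occlusion bodies is convenient.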

Recombinant baculovirus expression vectors have been developed for infection into several insect cells. For
example, recombinant baculoviruses have been developed for, inter alia: Aedes aegypti, Autographa
californica, Bombyx mori, Drosophila melanogaster, Spodoptera frugiperda, and Trichoplusia ni (WO
89/046699; Carbonell et al., (1985) J. Virol. 55:153; Wright (1986) Nature 321:718; Smith et al., (1983) Mol.
Cell. Biol. 3:2156; and see generally, Fraser, et al. (1989) In Vitro Cell. Dev. Biol. 25:225).

Cells and cell culture media are commercially available for both direct and fusion expression of heterologous 
polypeptides in a baculovirus/expression system; cell culture technology is generally known to those skilled in 
the art. See, e.g. Summers and Smith supra.

The modified insect cells may then be grown in an appropriate nutrient medium, which allows for stable 
maintenance of the plasmid(s) present in the modified insect host. Where the expression product gene is under 
inducible control, the host may be grown to high density, and expression induced. Alternatively, where 
expression is constitutive, the product will be continuously expressed into the medium and the nutrient medium
must be continuously circulated, while removing the product of interest and augmenting depleted nutrients. The
product may be purified by such techniques as chromatography, e.g. HPLC, affinity chromatography, ion 
exchange chromatography, etc.; electrophoresis; density gradient centrifugation; solvent extraction, or the like. 
As appropriate, the product may be further purified, as required, so as to remove substantially any insect proteins 
which are also secreted in the medium or result from lysis of insect cells, so as to provide a product which is at
least substantially free of host debris, e.g. proteins, lipids and polysaccharides.

In order to obtain protein expression, recombinant host cells derived from the transformants are incubated under 
conditions which allow expression of the recombinant protein encoding sequence. These conditions will vary, 
dependent upon the host cell selected. However, the conditions are readily ascertainable to those of ordinary skill 
in the art, based upon what is known in the art. 

iii. Plant Systems

There are many plant cell culture and whole plant genetic expression systems known in the art. Exemplary plant
cellular genetic expression systems include those described in patents, such as: US 5,693,506; US 5,659,122;
and US 5,608,143. Additional examples of genetic expression in plant cell culture have been described by Zenk,
Phytochemistry 30:3861-3863 (1991). Descriptions of plant protein signal peptides may be found in addition to
the references described above in Vaulcombe et al., Mol. Gen. Genet. 209:33-40 (1987); Chandler et al., Plant
Molecular Biology 3:407-418 (1984); Rogers, J. Biol. Chem. 260:3731-3738 (1985); Rothstein et al., Gene
55:353-356 (1987); Whittier et al., Nucleic Acids Research 15:2515-2535 (1987); Wirsel et al., Molecular
Microbiology 3:3-14 (1989); Yu et al., Gene 122:247-253 (1992). A description of the regulation of plant gene
expression by the phytohormone, gibberellic acid and secreted enzymes induced by gibberellic acid can be found
in R.L. Jones and J. MacMillin, Gibberellins: in: Advanced Plant Physiology, Malcolm B. Wilkins, ed., 1984
Pitman Publishing Limited, London, pp. 21-52. References that describe other metabolically-regulated genes:
Sheen, Plant Cell, 2:1027-1038 (1990); Maas et al., EMBO J. 9:3447-3452 (1990); Benkel and Hickey, Proc.
Natl. Acad. Sci. 84:1337-1339 (1987).

Typically, using techniques known in the art, a desired polynucleotide sequence is inserted into an expression
cassette comprising genetic regulatory elements designed for operation in plants. The expression cassette is
inserted into a desired expression vector with companion sequences upstream and downstream from the
expression cassette suitable for expression in a plant host. The companion sequences will be of plasmid or viral
origin and provide necessary characteristics to the vector to permit the vectors to move DNA from an original
cloning host, such as bacteria, to the desired plant host. The basic bacterial/plant vector construct will preferably
provide a broad host range prokaryote replication origin; a prokaryote selectable marker; and, for Agrobacterium
transformations, T DNA sequences for Agrobacterium-mediated transfer to plant chromosomes. Where the
heterologous gene is not readily amenable to detection, the construct will preferably also have a selectable
marker gene suitable for determining if a plant cell has been transformed. A general review of suitable markers,
for example for the members of the grass family, is found in Wilmink and Dons, 1993, Plant Mol. Biol. Reptr,
11(2):165-185.

Sequences suitable for permitting integration of the heterologous sequence into the plant genome are also 
recommended. These might include transposon sequences and the like for homologous recombination as well as 
Ti sequences which permit random insertion of a heterologous expression cassette into a plant genome. Suitable
prokaryote selectable markers include resistance toward antibiotics such as ampicillin or tetracycline. Other 
DNA sequences encoding additional functions may also be present in the vector, as is known in the art. 

The nucleic acid molecules of the subject invention may be included into an expression cassette for expression
of the protein(s) of interest. Usually, there will be only one expression cassette, although two or more are
feasible. The recombinant expression cassette will contain, in addition to the heterologous protein encoding
sequence, the following elements: a promoter region, plant 5' untranslated sequences, an initiation codon depending
upon whether or not the structural gene comes equipped with one, and a transcription and translation termination
sequence. Unique restriction enzyme sites at the 5' and 3' ends of the cassette allow for easy insertion into a
pre-existing vector.

A heterologous coding sequence may be for any protein relating to the present invention. The sequence encoding
the protein of interest will encode a signal peptide which allows processing and translocation of the protein, as
appropriate, and will usually lack any sequence which might result in the binding of the desired protein of the
invention to a membrane. Since, for the most part, the transcriptional initiation region will be for a gene which is
expressed and translocated during germination, by employing the signal peptide which provides for
translocation, one may also provide for translocation of the protein of interest. In this way, the protein(s) of
interest will be translocated from the cells in which they are expressed and may be efficiently harvested.
Typically, secretion in seeds is across the aleurone or scutellar epithelium layer into the endosperm of the seed.
While it is not required that the protein be secreted from the cells in which the protein is produced, this
facilitates the isolation and purification of the recombinant protein.

Since the ultimate expression of the desired gene product will be in a eucaryotic cell, it is desirable to determine
whether any portion of the cloned gene contains sequences which will be processed out as introns by the host's
spliceosome machinery. If so, site-directed mutagenesis of the "intron" region may be conducted to prevent losing
a portion of the genetic message as a false intron code, Reed and Maniatis, Cell 41:95-105, 1985.
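
A first-pass check for such false introns can be sketched as a scan for stretches that begin with the canonical GT donor dinucleotide and end with the AG acceptor dinucleotide. This is a deliberately naive heuristic, not a splice-site predictor: the GT...AG rule is general background knowledge rather than something stated in the text, the length bounds are arbitrary assumptions, and most hits in a real sequence would be false positives that need further evaluation.

```python
from typing import List, Tuple


def candidate_introns(seq: str, min_len: int = 70, max_len: int = 500) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, of GT...AG stretches.

    Each reported stretch seq[start:end] starts with "GT", ends with "AG",
    and has a length between min_len and max_len inclusive. The default
    length bounds are illustrative assumptions only.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 1):
        if seq[start:start + 2] != "GT":
            continue
        # Try every allowed stretch length that still fits in the sequence.
        last_end = min(start + max_len, len(seq))
        for end in range(start + min_len, last_end + 1):
            if seq[end - 2:end] == "AG":
                hits.append((start, end))
    return hits


if __name__ == "__main__":
    toy = "ATG" + "GT" + "C" * 10 + "AG" + "TAA"  # hypothetical toy sequence
    print(candidate_introns(toy, min_len=14, max_len=14))
```

Any region flagged this way would only be a candidate for the site-directed mutagenesis described above; deciding whether a host actually splices it requires experimental or more sophisticated computational evidence.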

The vector can be microinjected directly into plant cells by use of micropipettes to mechanically transfer the
recombinant DNA. Crossway, Mol. Gen. Genet, 202:179-185, 1985. The genetic material may also be
transferred into the plant cell by using polyethylene glycol, Krens, et al., Nature, 296, 72-74, 1982. Another
method of introduction of nucleic acid segments is high velocity ballistic penetration by small particles with the
nucleic acid either within the matrix of small beads or particles, or on the surface, Klein, et al., Nature, 327, 70-
73, 1987 and Knudsen and Muller, 1991, Planta, 185:330-336 teaching particle bombardment of barley
endosperm to create transgenic barley. Yet another method of introduction would be fusion of protoplasts with
other entities, either minicells, cells, lysosomes or other fusible lipid-surfaced bodies, Fraley, et al., Proc. Natl.
Acad. Sci. USA, 79, 1859-1863, 1982.

The vector may also be introduced into the plant cells by electroporation (Fromm et al., Proc. Natl. Acad. Sci.
USA 82:5824, 1985). In this technique, plant protoplasts are electroporated in the presence of plasmids
containing the gene construct. Electrical impulses of high field strength reversibly permeabilize biomembranes
allowing the introduction of the plasmids. Electroporated plant protoplasts reform the cell wall, divide, and form
plant callus.

All plants from which protoplasts can be isolated and cultured to give whole regenerated plants can be
transformed by the present invention so that whole plants are recovered which contain the transferred gene. It is
known that practically all plants can be regenerated from cultured cells or tissues, including but not limited to all
major species of sugarcane, sugar beet, cotton, fruit and other trees, legumes and vegetables. Some suitable
plants include, for example, species from the genera Fragaria, Lotus, Medicago, Onobrychis, Trifolium,
Trigonella, Vigna, Citrus, Linum, Geranium, Manihot, Daucus, Arabidopsis, Brassica, Raphanus, Sinapis,
Atropa, Capsicum, Datura, Hyoscyamus, Lycopersicon, Nicotiana, Solanum, Petunia, Digitalis, Majorana,
Cichorium, Helianthus, Lactuca, Bromus, Asparagus, Antirrhinum, Hemerocallis, Nemesia, Pelargonium,
Panicum, Pennisetum, Ranunculus, Senecio, Salpiglossis, Cucumis, Browallia, Glycine, Lolium, Zea, Triticum,
Sorghum, and Datura.

Means for regeneration vary from species to species of plants, but generally a suspension of transformed 
protoplasts containing copies of the heterologous gene is first provided. Callus tissue is formed and shoots may
be induced from callus and subsequently rooted. Alternatively, embryo formation can be induced from the
protoplast suspension. These embryos germinate as natural embryos to form plants. The culture media will 
generally contain various amino acids and hormones, such as auxin and cytokinins. It is also advantageous to 
add glutamic acid and proline to the medium, especially for such species as corn and alfalfa. Shoots and roots 
normally develop simultaneously. Efficient regeneration will depend on the medium, on the genotype, and on
the history of the culture. If these three variables are controlled, then regeneration is fully reproducible and
repeatable. 

In some plant cell culture systems, the desired protein of the invention may be excreted or alternatively, the 
protein may be extracted from the whole plant. Where the desired protein of the invention is secreted into the 
medium, it may be collected. Alternatively, the embryos and embryoless-half seeds or other plant tissue may be 
mechanically disrupted to release any secreted protein between cells and tissues. The mixture may be suspended
in a buffer solution to retrieve soluble proteins. Conventional protein isolation and purification methods will be
then used to purify the recombinant protein. Parameters of time, temperature, pH, oxygen, and volumes will be
adjusted through routine methods to optimize expression and recovery of heterologous protein. 

iv. Bacterial Systems

Bacterial expression techniques are known in the art. A bacterial promoter is any DNA sequence capable of
binding bacterial RNA polymerase and initiating the downstream (3') transcription of a coding sequence (e.g.
structural gene) into mRNA. A promoter will have a transcription initiation region which is usually placed
proximal to the 5' end of the coding sequence. This transcription initiation region usually includes an RNA
polymerase binding site and a transcription initiation site. A bacterial promoter may also have a second domain
called an operator, that may overlap an adjacent RNA polymerase binding site at which RNA synthesis begins.
The operator permits negatively regulated (inducible) transcription, as a gene repressor protein may bind the
operator and thereby inhibit transcription of a specific gene. Constitutive expression may occur in the absence of
negative regulatory elements, such as the operator. In addition, positive regulation may be achieved by a gene
activator protein binding sequence, which, if present, is usually proximal (5') to the RNA polymerase binding
sequence. An example of a gene activator protein is the catabolite activator protein (CAP), which helps initiate
transcription of the lac operon in Escherichia coli (E. coli) [Raibaud et al. (1984) Annu. Rev. Genet. 18:113].
Regulated expression may therefore be either positive or negative, thereby either enhancing or reducing
transcription.

Sequences encoding metabolic pathway enzymes provide particularly useful promoter sequences. Examples
include promoter sequences derived from sugar metabolizing enzymes, such as galactose, lactose (lac) [Chang et
al. (1977) Nature 198:1056], and maltose. Additional examples include promoter sequences derived from
biosynthetic enzymes such as tryptophan (trp) [Goeddel et al. (1980) Nuc. Acids Res. 5:4057; Yelverton et al.
(1981) Nucl. Acids Res. 5:731; US patent 4,738,921; EP-A-0036776 and EP-A-0121775]. The β-lactamase (bla)
promoter system [Weissmann (1981) "The cloning of interferon and other mistakes." In Interferon 3 (ed. I.
Gresser)], bacteriophage lambda PL [Shimatake et al. (1981) Nature 292:128] and T5 [US patent 4,689,406]
promoter systems also provide useful promoter sequences.

In addition, synthetic promoters which do not occur in nature also function as bacterial promoters. For example,
transcription activation sequences of one bacterial or bacteriophage promoter may be joined with the operon
sequences of another bacterial or bacteriophage promoter, creating a synthetic hybrid promoter [US
patent 4,551,433]. For example, the tac promoter is a hybrid trp-lac promoter comprised of both trp promoter
and lac operon sequences that is regulated by the lac repressor [Amann et al. (1983) Gene 25:167; de Boer et al.
(1983) Proc. Natl. Acad. Sci. 80:21]. Furthermore, a bacterial promoter can include naturally occurring
promoters of non-bacterial origin that have the ability to bind bacterial RNA polymerase and initiate
transcription. A naturally occurring promoter of non-bacterial origin can also be coupled with a compatible RNA
polymerase to produce high levels of expression of some genes in prokaryotes. The bacteriophage T7 RNA
polymerase/promoter system is an example of a coupled promoter system [Studier et al. (1986) J. Mol. Biol.
189:113; Tabor et al. (1985) Proc. Natl. Acad. Sci. 82:1074]. In addition, a hybrid promoter can also be
comprised of a bacteriophage promoter and an E. coli operator region (EPO-A-0 267 851).

In addition to a functioning promoter sequence, an efficient ribosome binding site is also useful for the
expression of foreign genes in prokaryotes. In E. coli, the ribosome binding site is called the Shine-Dalgarno
(SD) sequence and includes an initiation codon (ATG) and a sequence 3-9 nucleotides in length located 3-11
nucleotides upstream of the initiation codon [Shine et al. (1975) Nature 254:34]. The SD sequence is thought to
promote binding of mRNA to the ribosome by the pairing of bases between the SD sequence and the 3' end of E.
coli 16S rRNA [Steitz et al. (1979) "Genetic signals and nucleotide sequences in messenger RNA." In Biological
Regulation and Development: Gene Expression (ed. R.F. Goldberger)]. To express eukaryotic genes and
prokaryotic genes with weak ribosome-binding site [Sambrook et al. (1989) "Expression of cloned genes in
Escherichia coli." In Molecular Cloning: A Laboratory Manual].
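
The geometry described above (an SD-like stretch of at least 3 nucleotides ending 3 to 11 nucleotides upstream of the ATG) can be sketched as a simple search. The consensus string AGGAGG, the function name, and the definition of the spacing as the number of nucleotides between the end of the match and the A of the ATG are assumptions made for illustration; the text itself does not specify them.

```python
from typing import Optional, Tuple

# Assumed consensus; the text gives only length and spacing, not a sequence.
SD_CONSENSUS = "AGGAGG"


def find_sd(seq: str, atg_index: int, min_len: int = 3,
            min_gap: int = 3, max_gap: int = 11) -> Optional[Tuple[str, int, int]]:
    """Find the longest SD-like match upstream of the ATG at atg_index.

    Returns (match, start_index, gap) or None, where gap is the number of
    nucleotides between the end of the match and the A of the ATG.
    A match is any substring of SD_CONSENSUS at least min_len long.
    """
    seq = seq.upper().replace("U", "T")
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at atg_index")
    best = None
    for gap in range(min_gap, max_gap + 1):
        end = atg_index - gap  # exclusive end of the candidate match
        # Try the longest candidate first for this spacing.
        for length in range(len(SD_CONSENSUS), min_len - 1, -1):
            start = end - length
            if start < 0:
                continue
            candidate = seq[start:end]
            if candidate in SD_CONSENSUS:
                if best is None or length > len(best[0]):
                    best = (candidate, start, gap)
                break
    return best


if __name__ == "__main__":
    leader = "TTAGGAGGTTTTTATGAAA"  # hypothetical leader; ATG starts at index 13
    print(find_sd(leader, leader.index("ATG")))
```

A result of None would correspond to the "weak ribosome-binding site" case mentioned above, at least under this simplified matching rule.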

A DNA molecule may be expressed intracellularly. A promoter sequence may be directly linked with the DNA
molecule, in which case the first amino acid at the N-terminus will always be a methionine, which is encoded by
the ATG start codon. If desired, methionine at the N-terminus may be cleaved from the protein by in vitro
incubation with cyanogen bromide or by either in vivo or in vitro incubation with a bacterial methionine N-
terminal peptidase (EPO-A-0 219 237).

Fusion proteins provide an alternative to direct expression. Usually, a DNA sequence encoding the N-terminal
portion of an endogenous bacterial protein, or other stable protein, is fused to the 5' end of heterologous coding
sequences. Upon expression, this construct will provide a fusion of the two amino acid sequences. For example,
the bacteriophage lambda cell gene can be linked at the 5' terminus of a foreign gene and expressed in bacteria.
The resulting fusion protein preferably retains a site for a processing enzyme (factor Xa) to cleave the
bacteriophage protein from the foreign gene [Nagai et al. (1984) Nature 309:810]. Fusion proteins can also be
made with sequences from the lacZ [Jia et al. (1987) Gene 50:197], trpE [Allen et al. (1987) J. Biotechnol. 5:93;
Makoff et al. (1989) J. Gen. Microbiol. 135:11], and Chey [EP-A-0 324 647] genes. The DNA sequence at the
junction of the two amino acid sequences may or may not encode a cleavable site. Another example is a
ubiquitin fusion protein. Such a fusion protein is made with the ubiquitin region that preferably retains a site for
a processing enzyme (e.g. ubiquitin specific processing-protease) to cleave the ubiquitin from the foreign
protein. Through this method, native foreign protein can be isolated [Miller et al. (1989) Bio/Technology 7:698].

Alternatively, foreign proteins can also be secreted from the cell by creating chimeric DNA molecules that 
encode a fusion protein comprised of a signal peptide sequence fragment that provides for secretion of the 
foreign protein in bacteria [US patent 4,336,336]. The signal sequence fragment usually encodes a signal peptide 
comprised of hydrophobic amino acids which direct the secretion of the protein from the cell. The protein is 
either secreted into the growth media (gram-positive bacteria) or into the periplasmic space, located between the
inner and outer membrane of the cell (gram-negative bacteria). Preferably there are processing sites, which can
be cleaved either in vivo or in vitro, encoded between the signal peptide fragment and the foreign gene.
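
The hydrophobic character of a signal peptide can be illustrated with a sliding-window hydropathy average. The sketch below uses the Kyte-Doolittle scale, which is a choice made here and not something the text specifies; the window size, the function name and the two example peptides are likewise hypothetical and are not real signal sequences.

```python
# Kyte-Doolittle hydropathy values (an assumed choice of scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def max_window_hydropathy(peptide: str, window: int = 8) -> float:
    """Return the highest mean hydropathy over any window of the peptide."""
    peptide = peptide.upper()
    if len(peptide) < window:
        raise ValueError("peptide is shorter than the window")
    scores = [
        sum(KYTE_DOOLITTLE[aa] for aa in peptide[i:i + window]) / window
        for i in range(len(peptide) - window + 1)
    ]
    return max(scores)


if __name__ == "__main__":
    hydrophobic_like = "MKKLLLALLLAVAAS"  # hypothetical leader-like peptide
    hydrophilic_like = "MKKDEDKRSEEDKRS"  # hypothetical charged peptide
    print(max_window_hydropathy(hydrophobic_like))
    print(max_window_hydropathy(hydrophilic_like))
```

A clearly positive maximum suggests a hydrophobic core of the kind associated with secretion signals, although a hydropathy score alone does not establish that a peptide functions as a signal sequence.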

DNA encoding suitable signal sequences can be derived from genes for secreted bacterial proteins, such as the
E. coli outer membrane protein gene (ompA) [Masui et al. (1983), in: Experimental Manipulation of Gene
Expression; Ghrayeb et al. (1984) EMBO J. 5:2437] and the E. coli alkaline phosphatase signal sequence (phoA)
[Oka et al. (1985) Proc. Natl. Acad. Sci. 82:7212]. As an additional example, the signal sequence of the alpha-
amylase gene from various Bacillus strains can be used to secrete heterologous proteins from B. subtilis [Palva
et al. (1982) Proc. Natl. Acad. Sci. USA 79:5582; EP-A-0 244 042].

Usually, transcription termination sequences recognized by bacteria are regulatory regions located 3' to the 
translation stop codon, and thus together with the promoter flank the coding sequence. These sequences direct
the transcription of an mRNA which can be translated into the polypeptide encoded by the DNA. Transcription 
termination sequences frequently include DNA sequences of about 50 nucleotides capable of forming stem loop 
structures that aid in terminating transcription. Examples include transcription termination sequences derived 
from genes with strong promoters, such as the trp gene in E. coli as well as other biosynthetic genes. 
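
The stem loop structures mentioned above arise from inverted repeats, where one stretch is the reverse complement of another a few nucleotides downstream. The following sketch looks for such perfect inverted repeats in a DNA string. The stem length and loop-size bounds are arbitrary assumptions, and real terminators also tolerate mismatches and depend on flanking sequence, which this sketch ignores.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_stem_loops(seq: str, stem: int = 6, min_loop: int = 3,
                    max_loop: int = 8) -> List[Tuple[int, str, int, int]]:
    """Find perfect inverted repeats that could pair as a stem loop.

    Returns tuples (left_start, left_arm, loop_length, right_start) where
    the right arm is the exact reverse complement of the left arm.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        target = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop  # start of the would-be right arm
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == target:
                hits.append((i, left, loop, j))
    return hits


if __name__ == "__main__":
    toy = "AA" + "GCCGCC" + "TTTT" + "GGCGGC" + "TTTTTTTT"  # hypothetical sequence
    print(find_stem_loops(toy))
```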

Usually, the above described components, comprising a promoter, signal sequence (if desired), coding sequence
of interest, and transcription termination sequence, are put together into expression constructs. Expression
constructs are often maintained in a replicon, such as an extrachromosomal element (e.g. plasmids) capable of
stable maintenance in a host, such as bacteria. The replicon will have a replication system, thus allowing it to be
maintained in a prokaryotic host either for expression or for cloning and amplification. In addition, a replicon
may be either a high or low copy number plasmid. A high copy number plasmid will generally have a copy
number ranging from about 5 to about 200, and usually about 10 to about 150. A host containing a high copy
number plasmid will preferably contain at least about 10, and more preferably at least about 20 plasmids. Either
a high or low copy number vector may be selected, depending upon the effect of the vector and the foreign
protein on the host.

Alternatively, the expression constructs can be integrated into the bacterial genome with an integrating vector. 
Integrating vectors usually contain at least one sequence homologous to the bacterial chromosome that allows 
the vector to integrate. Integrations appear to result from recombinations between homologous DNA in the 
vector and the bacterial chromosome. For example, integrating vectors constructed with DNA from various 
Bacillus strains integrate into the Bacillus chromosome (EP-A-0 127 328). Integrating vectors may also be
comprised of bacteriophage or transposon sequences. 

Usually, extrachromosomal and integrating expression constructs may contain selectable markers to allow for
the selection of bacterial strains that have been transformed. Selectable markers can be expressed in the bacterial
host and may include genes which render bacteria resistant to drugs such as ampicillin, chloramphenicol,
erythromycin, kanamycin (neomycin), and tetracycline [Davies et al. (1978) Annu. Rev. Microbiol. 32:469].
Selectable markers may also include biosynthetic genes, such as those in the histidine, tryptophan, and leucine 
biosynthetic pathways. 

Alternatively, some of the above described components can be put together in transformation vectors. 
Transformation vectors are usually comprised of a selectable marker that is either maintained in a replicon or
developed into an integrating vector, as described above.

Expression and transformation vectors, either extra-chromosomal replicons or integrating vectors, have been
developed for transformation into many bacteria. For example, expression vectors have been developed for, inter
alia, the following bacteria: Bacillus subtilis [Palva et al. (1982) Proc. Natl. Acad. Sci. USA 79:5582; EP-A-0
036 259 and EP-A-0 063 953; WO 84/04541], Escherichia coli [Shimatake et al. (1981) Nature 292:128; Amann
et al. (1985) Gene 40:183; Studier et al. (1986) J. Mol. Biol. 189:113; EP-A-0 036 776, EP-A-0 136 829 and
EP-A-0 136 907], Streptococcus cremoris [Powell et al. (1988) Appl. Environ. Microbiol. 54:655]; Streptococcus
lividans [Powell et al. (1988) Appl. Environ. Microbiol. 54:655], Streptomyces lividans [US patent 4,745,056].

Methods of introducing exogenous DNA into bacterial hosts are well-known in the art, and usually include
either the transformation of bacteria treated with CaCl2 or other agents, such as divalent cations and DMSO.
DNA can also be introduced into bacterial cells by electroporation. Transformation procedures usually vary with
the bacterial species to be transformed. See e.g. [Masson et al. (1989) FEMS Microbiol. Lett. 60:273; Palva et al.
(1982) Proc. Natl. Acad. Sci. USA 79:5582; EP-A-0 036 259 and EP-A-0 063 953; WO 84/04541, Bacillus],
[Miller et al. (1988) Proc. Natl. Acad. Sci. 85:856; Wang et al. (1990) J. Bacteriol. 172:949, Campylobacter],
[Cohen et al. (1973) Proc. Natl. Acad. Sci. 69:2110; Dower et al. (1988) Nucleic Acids Res. 16:6127; Kushner
(1978) "An improved method for transformation of Escherichia coli with ColE1-derived plasmids." In Genetic
Engineering: Proceedings of the International Symposium on Genetic Engineering (eds. H.W. Boyer and S.
Nicosia); Mandel et al. (1970) J. Mol. Biol. 55:159; Taketo (1988) Biochim. Biophys. Acta 949:318;
Escherichia], [Chassy et al. (1987) FEMS Microbiol. Lett. 44:173, Lactobacillus]; [Fiedler et al. (1988) Anal.
Biochem. 170:38, Pseudomonas]; [Augustin et al. (1990) FEMS Microbiol. Lett. 66:203, Staphylococcus],
[Barany et al. (1980) J. Bacteriol. 144:698; Harlander (1987) "Transformation of Streptococcus lactis by
electroporation," in: Streptococcal Genetics (ed. J. Ferretti and R. Curtiss III); Perry et al. (1981) Infect. Immun.
32:1295; Powell et al. (1988) Appl. Environ. Microbiol. 54:655; Somkuti et al. (1987) Proc. 4th Eur. Cong.
Biotechnology 7:412, Streptococcus].

v. Yeast Expression 

Yeast expression systems are also known to one of ordinary skill in the art. A yeast promoter is any DNA
sequence capable of binding yeast RNA polymerase and initiating the downstream (3') transcription of a coding
sequence (e.g. structural gene) into mRNA. A promoter will have a transcription initiation region which is
usually placed proximal to the 5' end of the coding sequence. This transcription initiation region usually includes
an RNA polymerase binding site (the "TATA Box") and a transcription initiation site. A yeast promoter may
also have a second domain called an upstream activator sequence (UAS), which, if present, is usually distal to
the structural gene. The UAS permits regulated (inducible) expression. Constitutive expression occurs in the
absence of a UAS. Regulated expression may be either positive or negative, thereby either enhancing or
reducing transcription.

Yeast is a fermenting organism with an active metabolic pathway; therefore, sequences encoding enzymes in the
metabolic pathway provide particularly useful promoter sequences. Examples include alcohol dehydrogenase
(ADH) (EP-A-0 284 044), enolase, glucokinase, glucose-6-phosphate isomerase, glyceraldehyde-3-phosphate-
dehydrogenase (GAP or GAPDH), hexokinase, phosphofructokinase, 3-phosphoglycerate mutase, and pyruvate
kinase (PyK) (EPO-A-0 329 203). The yeast PHO5 gene, encoding acid phosphatase, also provides useful
promoter sequences [Myanohara et al. (1983) Proc. Natl. Acad. Sci. USA 80:1].

In addition, synthetic promoters which do not occur in nature also function as yeast promoters. For example,
UAS sequences of one yeast promoter may be joined with the transcription activation region of another yeast
promoter, creating a synthetic hybrid promoter. Examples of such hybrid promoters include the ADH regulatory
sequence linked to the GAP transcription activation region (US Patent Nos. 4,876,197 and 4,880,734). Other
examples of hybrid promoters include promoters which consist of the regulatory sequences of either the ADH2,
GAL4, GAL10, or PHO5 genes, combined with the transcriptional activation region of a glycolytic enzyme
gene such as GAP or PyK (EP-A-0 164 556). Furthermore, a yeast promoter can include naturally occurring
promoters of non-yeast origin that have the ability to bind yeast RNA polymerase and initiate transcription.
Examples of such promoters include, inter alia, [Cohen et al. (1980) Proc. Natl. Acad. Sci. USA 77:1078;
Henikoff et al. (1981) Nature 283:835; Hollenberg et al. (1981) Curr. Topics Microbiol. Immunol. 96:119;
Hollenberg et al. (1979) "The Expression of Bacterial Antibiotic Resistance Genes in the Yeast Saccharomyces
cerevisiae," in: Plasmids of Medical, Environmental and Commercial Importance (eds. K.N. Timmis and A.
Puhler); Mercerau-Puigalon et al. (1980) Gene 11:163; Panthier et al. (1980) Curr. Genet. 2:109].

A DNA molecule may be expressed intracellularly in yeast. A promoter sequence may be directly linked with
the DNA molecule, in which case the first amino acid at the N-terminus of the recombinant protein will always
be a methionine, which is encoded by the ATG start codon. If desired, methionine at the N-terminus may be
cleaved from the protein by in vitro incubation with cyanogen bromide.

Fusion proteins provide an alternative for yeast expression systems, as well as in mammalian, baculovirus, and
bacterial expression systems. Usually, a DNA sequence encoding the N-terminal portion of an endogenous yeast
protein, or other stable protein, is fused to the 5' end of heterologous coding sequences. Upon expression, this
construct will provide a fusion of the two amino acid sequences. For example, the yeast or human superoxide
dismutase (SOD) gene can be linked at the 5' terminus of a foreign gene and expressed in yeast. The DNA
sequence at the junction of the two amino acid sequences may or may not encode a cleavable site. See e.g.
EP-A-0 196 056. Another example is a ubiquitin fusion protein. Such a fusion protein is made with the ubiquitin
region that preferably retains a site for a processing enzyme (e.g. ubiquitin-specific processing protease) to
cleave the ubiquitin from the foreign protein. Through this method, therefore, native foreign protein can be
isolated (e.g. WO 88/024066).

Alternatively, foreign proteins can also be secreted from the cell into the growth media by creating chimeric
DNA molecules that encode a fusion protein comprised of a leader sequence fragment that provides for secretion
in yeast of the foreign protein. Preferably, there are processing sites encoded between the leader fragment and
the foreign gene that can be cleaved either in vivo or in vitro. The leader sequence fragment usually encodes a
signal peptide comprised of hydrophobic amino acids which direct the secretion of the protein from the cell.

DNA encoding suitable signal sequences can be derived from genes for secreted yeast proteins, such as the
genes for invertase (EP-A-0012873; JPO 62,096,086) and A-factor (US patent 4,588,684). Alternatively, leaders
of non-yeast origin exist, such as an interferon leader, that also provide for secretion in yeast (EP-A-0060057).

A preferred class of secretion leaders are those that employ a fragment of the yeast alpha-factor gene, which
contains both a "pre" signal sequence, and a "pro" region. The types of alpha-factor fragments that can be
employed include the full-length pre-pro alpha factor leader (about 83 amino acid residues) as well as truncated
alpha-factor leaders (usually about 25 to about 50 amino acid residues) (US Patents 4,546,083 and 4,870,008;
EP-A-0 324 274). Additional leaders employing an alpha-factor leader fragment that provides for secretion
include hybrid alpha-factor leaders made with a presequence of a first yeast, but a pro-region from a second
yeast alpha-factor (e.g. see WO 89/02463).

Usually, transcription termination sequences recognized by yeast are regulatory regions located 3' to the 
translation stop codon, and thus together with the promoter flank the coding sequence. These sequences direct 
the transcription of an mRNA which can be translated into the polypeptide encoded by the DNA. Examples of
transcription terminator sequence and other yeast-recognized termination sequences, such as those coding for 
glycolytic enzymes. 

Usually, the above described components, comprising a promoter, leader (if desired), coding sequence of
interest, and transcription termination sequence, are put together into expression constructs. Expression
constructs are often maintained in a replicon, such as an extrachromosomal element (e.g. plasmids) capable of
stable maintenance in a host, such as yeast or bacteria. The replicon may have two replication systems, thus
allowing it to be maintained, for example, in yeast for expression and in a prokaryotic host for cloning and
amplification. Examples of such yeast-bacteria shuttle vectors include YEp24 [Botstein et al. (1979) Gene 8:17-24],
pCI/1 [Brake et al. (1984) Proc. Natl. Acad. Sci. USA 81:4642-4646], and YRp17 [Stinchcomb et al. (1982)
J. Mol. Biol. 158:157]. In addition, a replicon may be either a high or low copy number plasmid. A high copy
number plasmid will generally have a copy number ranging from about 5 to about 200, and usually about 10 to
about 150. A host containing a high copy number plasmid will preferably have at least about 10, and more
preferably at least about 20. Either a high or low copy number vector may be selected, depending upon the effect
of the vector and the foreign protein on the host. See e.g. Brake et al., supra.

Alternatively, the expression constructs can be integrated into the yeast genome with an integrating vector.
Integrating vectors usually contain at least one sequence homologous to a yeast chromosome that allows the
vector to integrate, and preferably contain two homologous sequences flanking the expression construct.
Integrations appear to result from recombinations between homologous DNA in the vector and the yeast
chromosome [Orr-Weaver et al. (1983) Methods in Enzymol. 101:228-245]. An integrating vector may be
directed to a specific locus in yeast by selecting the appropriate homologous sequence for inclusion in the vector.
See Orr-Weaver et al., supra. One or more expression constructs may integrate, possibly affecting levels of
recombinant protein produced [Rine et al. (1983) Proc. Natl. Acad. Sci. USA 80:6750]. The chromosomal
sequences included in the vector can occur either as a single segment in the vector, which results in the
integration of the entire vector, or as two segments homologous to adjacent segments in the chromosome and
flanking the expression construct in the vector, which can result in the stable integration of only the expression
construct.

Usually, extrachromosomal and integrating expression constructs may contain selectable markers to allow for
the selection of yeast strains that have been transformed. Selectable markers may include biosynthetic genes that
can be expressed in the yeast host, such as ADE2, HIS4, LEU2, TRP1, and ALG7, and the G418 resistance gene,
which confer resistance in yeast cells to tunicamycin and G418, respectively. In addition, a suitable selectable
marker may also provide yeast with the ability to grow in the presence of toxic compounds, such as metal. For
example, the presence of CUP1 allows yeast to grow in the presence of copper ions [Butt et al. (1987)
Microbiol. Rev. 51:351].

Alternatively, some of the above described components can be put together into transformation vectors. 
Transformation vectors are usually comprised of a selectable marker that is either maintained in a replicon or
developed into an integrating vector, as described above.

Expression and transformation vectors, either extrachromosomal replicons or integrating vectors, have been
developed for transformation into many yeasts. For example, expression vectors have been developed for, inter
alia, the following yeasts: Candida albicans [Kurtz et al. (1986) Mol. Cell. Biol. 6:142], Candida maltosa
[Kunze et al. (1985) J. Basic Microbiol. 25:141], Hansenula polymorpha [Gleeson et al. (1986) J. Gen.
Microbiol. 132:3459; Roggenkamp et al. (1986) Mol. Gen. Genet. 202:302], Kluyveromyces fragilis [Das et al.
(1984) J. Bacteriol. 158:1165], Kluyveromyces lactis [De Louvencourt et al. (1983) J. Bacteriol. 154:737; Van
den Berg et al. (1990) Bio/Technology 8:135], Pichia guillerimondii [Kunze et al. (1985) J. Basic Microbiol.
25:141], Pichia pastoris [Cregg et al. (1985) Mol. Cell. Biol. 5:3376; US Patent Nos. 4,837,148 and 4,929,555],
Saccharomyces cerevisiae [Hinnen et al. (1978) Proc. Natl. Acad. Sci. USA 75:1929; Ito et al. (1983) J.
Bacteriol. 153:163], Schizosaccharomyces pombe [Beach and Nurse (1981) Nature 300:706], and Yarrowia
lipolytica [Davidow et al. (1985) Curr. Genet. 10:39; Gaillardin et al. (1985) Curr. Genet. 10:49].

Methods of introducing exogenous DNA into yeast hosts are well-known in the art, and usually include either
the transformation of spheroplasts or of intact yeast cells treated with alkali cations. Transformation procedures
usually vary with the yeast species to be transformed. See e.g. [Kurtz et al. (1986) Mol. Cell. Biol. 6:142; Kunze
et al. (1985) J. Basic Microbiol. 25:141; Candida]; [Gleeson et al. (1986) J. Gen. Microbiol. 132:3459;
Roggenkamp et al. (1986) Mol. Gen. Genet. 202:302; Hansenula]; [Das et al. (1984) J. Bacteriol. 158:1165; De
Louvencourt et al. (1983) J. Bacteriol. 154:1165; Van den Berg et al. (1990) Bio/Technology 8:135;
Kluyveromyces]; [Cregg et al. (1985) Mol. Cell. Biol. 5:3376; Kunze et al. (1985) J. Basic Microbiol. 25:141;
US Patents 4,837,148 & 4,929,555; Pichia]; [Hinnen et al. (1978) Proc. Natl. Acad. Sci. USA 75:1929; Ito et al.
(1983) J. Bacteriol. 153:163; Saccharomyces]; [Beach & Nurse (1981) Nature 300:706; Schizosaccharomyces];
[Davidow et al. (1985) Curr. Genet. 10:39; Gaillardin et al. (1985) Curr. Genet. 10:49; Yarrowia].
Pharmaceutical Compositions

Pharmaceutical compositions can comprise polypeptides and/or nucleic acid of the invention. The 
pharmaceutical compositions will comprise a therapeutically effective amount of either polypeptides, antibodies, 
or polynucleotides of the claimed invention.

The term "therapeutically effective amount" as used herein refers to an amount of a therapeutic agent to treat, 
ameliorate, or prevent a desired disease or condition, or to exhibit a detectable therapeutic or preventative effect. 
The effect can be detected by, for example, chemical markers or antigen levels. Therapeutic effects also include 
reduction in physical symptoms, such as decreased body temperature. The precise effective amount for a subject 
will depend upon the subject's size and health, the nature and extent of the condition, and the therapeutics or
combination of therapeutics selected for administration. Thus, it is not useful to specify an exact effective 
amount in advance. However, the effective amount for a given situation can be determined by routine 
experimentation and is within the judgement of the clinician. 

For purposes of the present invention, an effective dose will be from about 0.01 mg/kg to 50 mg/kg or 0.05
mg/kg to about 10 mg/kg of the DNA constructs in the individual to which it is administered.
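
As an illustration only, the per-kilogram ranges stated above can be converted into a total amount for a subject of a given body weight. The sketch below is a minimal arithmetic example; the function name `dose_range_mg` and the 70 kg body weight are assumptions introduced for illustration and are not part of the disclosure.

```python
def dose_range_mg(weight_kg, low_mg_per_kg=0.01, high_mg_per_kg=50.0):
    """Return (minimum, maximum) total amount in mg for a given body weight.

    The default bounds are the broader range quoted in the text
    (about 0.01 mg/kg to 50 mg/kg). The function name is hypothetical.
    """
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    return (low_mg_per_kg * weight_kg, high_mg_per_kg * weight_kg)


# Hypothetical 70 kg subject (the weight is an assumption, not from the text).
broad = dose_range_mg(70.0)               # 0.01 to 50 mg/kg range
narrow = dose_range_mg(70.0, 0.05, 10.0)  # 0.05 to 10 mg/kg range
print(broad, narrow)
```

The calculation only scales the stated bounds by body weight; as noted above, the amount appropriate to a given situation is determined by routine experimentation and is within the judgement of the clinician.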

A pharmaceutical composition can also contain a pharmaceutically acceptable carrier. The term
"pharmaceutically acceptable carrier" refers to a carrier for administration of a therapeutic agent, such as
antibodies or a polypeptide, genes, and other therapeutic agents. The term refers to any pharmaceutical carrier
that does not itself induce the production of antibodies harmful to the individual receiving the composition, and
which may be administered without undue toxicity. Suitable carriers may be large, slowly metabolized
macromolecules such as proteins, polysaccharides, polylactic acids, polyglycolic acids, polymeric amino acids,
amino acid copolymers, and inactive virus particles. Such carriers are well known to those of ordinary skill in
the art.

Pharmaceutically acceptable salts can be used therein, for example, mineral acid salts such as hydrochlorides,
hydrobromides, phosphates, sulfates, and the like; and the salts of organic acids such as acetates, propionates,
malonates, benzoates, and the like. A thorough discussion of pharmaceutically acceptable excipients is available
in Remington's Pharmaceutical Sciences (Mack Pub. Co., N.J. 1991).

Pharmaceutically acceptable carriers in therapeutic compositions may contain liquids such as water, saline, 
glycerol and ethanol. Additionally, auxiliary substances, such as wetting or emulsifying agents, pH buffering 
substances, and the like, may be present in such vehicles. Typically, the therapeutic compositions are prepared as
injectables, either as liquid solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid 
vehicles prior to injection may also be prepared. Liposomes are included within the definition of a 
pharmaceutically acceptable carrier. 
Delivery Methods

Once formulated, the compositions of the invention can be administered directly to the subject. The subjects to 
be treated can be animals; in particular, human subjects can be treated. 

Direct delivery of the compositions will generally be accomplished by injection, either subcutaneously, 
intraperitoneally, intravenously or intramuscularly or delivered to the interstitial space of a tissue. The
compositions can also be administered into a lesion. Other modes of administration include oral and pulmonary 
administration, suppositories, and transdermal or transcutaneous applications (e.g. see WO98/20734), needles, 
and gene guns or hyposprays. Dosage treatment may be a single dose schedule or a multiple dose schedule. 
Vaccines 

Vaccines according to the invention may either be prophylactic (ie. to prevent infection) or therapeutic (ie. to
treat disease after infection). 

Such vaccines comprise immunising antigen(s), immunogen(s), polypeptide(s), protein(s) or nucleic acid,
usually in combination with "pharmaceutically acceptable carriers," which include any carrier that does not itself
induce the production of antibodies harmful to the individual receiving the composition. Suitable carriers are
typically large, slowly metabolized macromolecules such as proteins, polysaccharides, polylactic acids,
polyglycolic acids, polymeric amino acids, amino acid copolymers, lipid aggregates (such as oil droplets or
liposomes), and inactive virus particles. Such carriers are well known to those of ordinary skill in the art.
Additionally, these carriers may function as immunostimulating agents ("adjuvants"). Furthermore, the antigen
or immunogen may be conjugated to a bacterial toxoid, such as a toxoid from diphtheria, tetanus, cholera, H.
pylori, etc. pathogens.

Preferred adjuvants to enhance effectiveness of the composition include, but are not limited to: (1) aluminum
salts (alum), such as aluminum hydroxide, aluminum phosphate, aluminum sulfate, etc; (2) oil-in-water
emulsion formulations (with or without other specific immunostimulating agents such as muramyl peptides (see
below) or bacterial cell wall components), such as for example (a) MF59™ (WO 90/14837; Chapter 10 in
Vaccine design: the subunit and adjuvant approach, eds. Powell & Newman, Plenum Press 1995), containing
5% Squalene, 0.5% Tween 80, and 0.5% Span 85 (optionally containing various amounts of MTP-PE (see
below), although not required) formulated into submicron particles using a microfluidizer such as Model 110Y
microfluidizer (Microfluidics, Newton, MA), (b) SAF, containing 10% Squalane, 0.4% Tween 80, 5%
pluronic-blocked polymer L121, and thr-MDP (see below) either microfluidized into a submicron emulsion or
vortexed to generate a larger particle size emulsion, and (c) Ribi™ adjuvant system (RAS), (Ribi Immunochem,
Hamilton, MT) containing 2% Squalene, 0.2% Tween 80, and one or more bacterial cell wall components from
the group consisting of monophosphorylipid A (MPL), trehalose dimycolate (TDM), and cell wall skeleton (CWS),
preferably MPL + CWS (Detox™); (3) saponin adjuvants, such as Stimulon™ (Cambridge Bioscience,
Worcester, MA) may be used or particles generated therefrom such as ISCOMs (immunostimulating
complexes); (4) Complete Freund's Adjuvant (CFA) and Incomplete Freund's Adjuvant (IFA); (5) cytokines,
such as interleukins (e.g. IL-1, IL-2, IL-4, IL-5, IL-6, IL-7, IL-12, etc.), interferons (e.g. gamma interferon),
macrophage colony stimulating factor (M-CSF), tumor necrosis factor (TNF), etc; and (6) other substances that
act as immunostimulating agents to enhance the effectiveness of the composition. Alum and MF59™ are
preferred.
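
The oil-in-water formulations listed under (2) above are defined by percentage composition, so the amount of each listed component in a batch follows by simple proportion. The sketch below is illustrative only: the text does not state whether the percentages are by volume or by weight, so the code treats them as fractions of the final batch amount in whatever unit is supplied. The names `FORMULATIONS` and `component_amounts`, the "remainder" entry, and the batch size of 100 are assumptions introduced for the example.

```python
# Percentages as listed in the text for each emulsion. The basis (v/v, w/v
# or w/w) is not stated there, so amounts are returned in the batch's own unit.
FORMULATIONS = {
    "MF59": {"squalene": 5.0, "Tween 80": 0.5, "Span 85": 0.5},
    "SAF": {"squalane": 10.0, "Tween 80": 0.4,
            "pluronic-blocked polymer L121": 5.0},
    "RAS": {"squalene": 2.0, "Tween 80": 0.2},
}


def component_amounts(name, batch_amount):
    """Return the amount of each listed component for a batch of a given size.

    "remainder" is whatever is left after the listed components (aqueous
    phase and any optional additives); this label is an assumption.
    """
    if batch_amount <= 0:
        raise ValueError("batch_amount must be positive")
    amounts = {component: batch_amount * percent / 100.0
               for component, percent in FORMULATIONS[name].items()}
    amounts["remainder"] = batch_amount - sum(amounts.values())
    return amounts


# Hypothetical batch of 100 units of the MF59-type emulsion.
mf59 = component_amounts("MF59", 100.0)
print(mf59)
```

Optional components such as MTP-PE or thr-MDP, and the bacterial cell wall components of RAS, are not quantified in the text and are therefore left out of the table.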
As mentioned above, muramyl peptides include, but are not limited to, N-acetyl-muramyl-L-threonyl-D-isoglutamine
(thr-MDP), N-acetyl-normuramyl-L-alanyl-D-isoglutamine (nor-MDP),
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine
(MTP-PE), etc.

The immunogenic compositions (e.g. the immunising antigen/immunogen/polypeptide/protein/nucleic acid,
pharmaceutically acceptable carrier, and adjuvant) typically will contain diluents, such as water, saline, glycerol,
ethanol, etc. Additionally, auxiliary substances, such as wetting or emulsifying agents, pH buffering substances,
and the like, may be present in such vehicles.

Typically, the immunogenic compositions are prepared as injectables, either as liquid solutions or suspensions;
solid forms suitable for solution in, or suspension in, liquid vehicles prior to injection may also be prepared. The
preparation also may be emulsified or encapsulated in liposomes for enhanced adjuvant effect, as discussed
above under pharmaceutically acceptable carriers.

Immunogenic compositions used as vaccines comprise an immunologically effective amount of the antigenic or
immunogenic polypeptides, as well as any other of the above-mentioned components, as needed. By
"immunologically effective amount", it is meant that the administration of that amount to an individual, either in
a single dose or as part of a series, is effective for treatment or prevention. This amount varies depending upon
the health and physical condition of the individual to be treated, the taxonomic group of individual to be treated
(e.g. nonhuman primate, primate, etc.), the capacity of the individual's immune system to synthesize antibodies,
the degree of protection desired, the formulation of the vaccine, the treating doctor's assessment of the medical
situation, and other relevant factors. It is expected that the amount will fall in a relatively broad range that can be
determined through routine trials.

The immunogenic compositions are conventionally administered parenterally, e.g. by injection, either
subcutaneously, intramuscularly, or transdermally/transcutaneously (e.g. WO98/20734). Additional formulations
suitable for other modes of administration include oral and pulmonary formulations, suppositories, and transdermal
applications. Dosage treatment may be a single dose schedule or a multiple dose schedule. The vaccine may be
administered in conjunction with other immunoregulatory agents.

As an alternative to protein-based vaccines, DNA vaccination may be employed [e.g. Robinson & Torres (1997) 
Seminars in Immunology 9:271-283; Donnelly et al (1997) Annu Rev Immunol 15:617-648; see later herein]. 
Gene Delivery Vehicles 

Gene therapy vehicles for delivery of constructs including a coding sequence of a therapeutic of the invention, to 
be delivered to the mammal for expression in the mammal, can be administered either locally or systemically.
These constructs can utilize viral or non-viral vector approaches in in vivo or ex vivo modality. Expression of 
such coding sequence can be induced using endogenous mammalian or heterologous promoters. Expression of 
the coding sequence in vivo can be either constitutive or regulated. 

The invention includes gene delivery vehicles capable of expressing the contemplated nucleic acid sequences. 
The gene delivery vehicle is preferably a viral vector and, more preferably, a retroviral, adenoviral,
adeno-associated viral (AAV), herpes viral, or alphavirus vector. The viral vector can also be an astrovirus,
coronavirus, orthomyxovirus, papovavirus, paramyxovirus, parvovirus, picornavirus, poxvirus, or togavirus viral
vector. See generally, Jolly (1994) Cancer Gene Therapy 1:51-64; Kimura (1994) Human Gene Therapy 
5:845-852; Connelly (1995) Human Gene Therapy 6:185-193; and Kaplitt (1994) Nature Genetics 6:148-153. 
Retroviral vectors are well known in the art and we contemplate that any retroviral gene therapy vector is
employable in the invention, including B, C and D type retroviruses, xenotropic retroviruses (for example,
NZB-X1, NZB-X2 and NZB9-1 (see O'Neill (1985) J. Virol. 53:160)), polytropic retroviruses e.g. MCF and
MCF-MLV (see Kelly (1983) J. Virol. 45:291), spumaviruses and lentiviruses. See RNA Tumor Viruses,
Second Edition, Cold Spring Harbor Laboratory, 1985.

Portions of the retroviral gene therapy vector may be derived from different retroviruses. For example,
retrovector LTRs may be derived from a Murine Sarcoma Virus, a tRNA binding site from a Rous Sarcoma 
Virus, a packaging signal from a Murine Leukemia Virus, and an origin of second strand synthesis from an 
Avian Leukosis Virus. 

These recombinant retroviral vectors may be used to generate transduction competent retroviral vector particles
by introducing them into appropriate packaging cell lines (see US patent 5,591,624). Retrovirus vectors can be
constructed for site-specific integration into host cell DNA by incorporation of a chimeric integrase enzyme into
the retroviral particle (see WO96/37626). It is preferable that the recombinant viral vector is a replication
defective recombinant virus.

Packaging cell lines suitable for use with the above-described retrovirus vectors are well known in the art, are
readily prepared (see WO95/30763 and WO92/05266), and can be used to create producer cell lines (also termed 
vector cell lines or "VCLs") for the production of recombinant vector particles. Preferably, the packaging cell 
lines are made from human parent cells (e.g. HT1080 cells) or mink parent cell lines, which eliminates 
inactivation in human serum. 

Preferred retroviruses for the construction of retroviral gene therapy vectors include Avian Leukosis Virus,
Bovine Leukemia Virus, Murine Leukemia Virus, Mink-Cell Focus-Inducing Virus, Murine Sarcoma Virus,
Reticuloendotheliosis Virus and Rous Sarcoma Virus. Particularly preferred Murine Leukemia Viruses include
4070A and 1504A (Hartley and Rowe (1976) J Virol 19:19-25), Abelson (ATCC No. VR-999), Friend (ATCC
No. VR-245), Graffi, Gross (ATCC No. VR-590), Kirsten, Harvey Sarcoma Virus and Rauscher (ATCC No.
VR-998) and Moloney Murine Leukemia Virus (ATCC No. VR-190). Such retroviruses may be obtained from
depositories or collections such as the American Type Culture Collection ("ATCC") in Rockville, Maryland or
isolated from known sources using commonly available techniques.

Exemplary known retroviral gene therapy vectors employable in this invention include those described in patent
applications GB2200651, EP0415731, EP0345242, EP0334301, WO89/02468, WO89/05349, WO89/09271,
WO90/02806, WO90/07936, WO94/03622, WO93/25698, WO93/25234, WO93/11230, WO93/10218,
WO91/02805, WO91/02825, WO95/07994, US 5,219,740, US 4,405,712, US 4,861,719, US 4,980,289, US
4,777,127, US 5,591,624. See also Vile (1993) Cancer Res 53:3860-3864; Vile (1993) Cancer Res 53:962-967;
Ram (1993) Cancer Res 53:83-88; Takamiya (1992) J Neurosci Res 33:493-503; Baba (1993) J
Neurosurg 79:729-735; Mann (1983) Cell 33:153; Cane (1984) Proc Natl Acad Sci 81:6349; and Miller (1990)
Human Gene Therapy 1.

Human adenoviral gene therapy vectors are also known in the art and employable in this invention. See, for
example, Berkner (1988) Biotechniques 6:616 and Rosenfeld (1991) Science 252:431, and WO93/07283,
WO93/06223, and WO93/07282. Exemplary known adenoviral gene therapy vectors employable in this
invention include those described in the above referenced documents and in WO94/12649, WO93/03769,
WO93/19191, WO94/28938, WO95/11984, WO95/00655, WO95/27071, WO95/29993, WO95/34671,
WO96/05320, WO94/08026, WO94/11506, WO93/06223, WO94/24299, WO95/14102, WO95/24297,
WO95/02697, WO94/28152, WO94/24299, WO95/09241, WO95/25807, WO95/05835, WO94/18922 and
WO95/09654. Alternatively, administration of DNA linked to killed adenovirus as described in Curiel (1992)
Hum. Gene Ther. 3:147-154 may be employed.

The gene delivery vehicles of the invention also include adenovirus associated virus (AAV) vectors. Leading and
preferred examples of such vectors for use in this invention are the AAV-2 based vectors disclosed in Srivastava,
WO93/09239. Most preferred AAV vectors comprise the two AAV inverted terminal repeats in which the native
D-sequences are modified by substitution of nucleotides, such that at least 5 native nucleotides and up to 18
native nucleotides, preferably at least 10 native nucleotides up to 18 native nucleotides, most preferably 10
native nucleotides are retained and the remaining nucleotides of the D-sequence are deleted or replaced with
non-native nucleotides. The native D-sequences of the AAV inverted terminal repeats are sequences of 20
consecutive nucleotides in each AAV inverted terminal repeat (ie. there is one sequence at each end) which are
not involved in HP formation. The non-native replacement nucleotide may be any nucleotide other than the
nucleotide found in the native D-sequence in the same position. Other employable exemplary AAV vectors are
pWP-19, pWN-1, both of which are disclosed in Nahreini (1993) Gene 124:257-262. Another example of such an
AAV vector is psub201 (see Samulski (1987) J. Virol. 61:3096). Another exemplary AAV vector is the Double-D
ITR vector. Construction of the Double-D ITR vector is disclosed in US Patent 5,478,745. Still other vectors are
those disclosed in Carter US Patent 4,797,368 and Muzyczka US Patent 5,139,941, Chartejee US Patent 5,474,935,
and Kotin WO94/288157. Yet a further example of an AAV vector employable in this invention is
SSV9AFABTKneo, which contains the AFP enhancer and albumin promoter and directs expression
predominantly in the liver. Its structure and construction are disclosed in Su (1996) Human Gene Therapy
7:463-470. Additional AAV gene therapy vectors are described in US 5,354,678, US 5,173,414, US 5,139,941,
and US 5,252,479.
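
The D-sequence modification described above (a native 20-nucleotide sequence in which 5 to 18 native nucleotides are retained and the rest are deleted or replaced with non-native nucleotides) can be sketched as a simple string operation. The example below is a minimal illustration under stated assumptions: the text does not specify which positions are retained, so the sketch keeps the first `n_retain` positions; the replacement rule (the next base in the order A, C, G, T) is arbitrary; and the input sequence is a placeholder, not the actual AAV D-sequence. The function name `modify_d_sequence` is hypothetical.

```python
BASES = "ACGT"


def modify_d_sequence(native, n_retain, replace=True):
    """Retain n_retain native nucleotides of a 20-nt D-sequence.

    Assumptions (not from the text): the retained nucleotides are the first
    n_retain positions, and each replaced position receives the next base in
    A->C->G->T->A order, which always differs from the native base.
    """
    if len(native) != 20:
        raise ValueError("a native D-sequence is 20 nucleotides long")
    if not 5 <= n_retain <= 18:
        raise ValueError("between 5 and 18 native nucleotides are retained")
    kept = native[:n_retain]
    if not replace:
        return kept  # remaining nucleotides deleted
    substituted = "".join(
        BASES[(BASES.index(base) + 1) % 4] for base in native[n_retain:]
    )
    return kept + substituted


# Placeholder 20-nt sequence, NOT the real AAV D-sequence.
placeholder = "ACGT" * 5
modified = modify_d_sequence(placeholder, 10)            # 10 native retained
truncated = modify_d_sequence(placeholder, 10, replace=False)
print(modified, truncated)
```

Retaining 10 native nucleotides corresponds to the most preferred case stated above; other values from 5 to 18 are handled the same way.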

The gene therapy vectors of the invention also include herpes vectors. Leading and preferred examples are
herpes simplex virus vectors containing a sequence encoding a thymidine kinase polypeptide such as those
disclosed in US 5,288,641 and EP0176170 (Roizman). Additional exemplary herpes simplex virus vectors
include HFEM/ICP6-LacZ disclosed in WO95/04139 (Wistar), pHSVlac described in Geller (1988) Science
241:1667-1669 and in WO90/09441 & WO92/07945, HSV Us3::pgC-lacZ described in Fink (1992) Human
Gene Therapy 3:11-39 and HSV 7134, 2 RH 105 and GAL4 described in EP 0453242 (Breakefield), and those
deposited with ATCC as accession numbers ATCC VR-977 and ATCC VR-260.

Also contemplated are alpha virus gene therapy vectors that can be employed in this invention. Preferred alpha
virus vectors are Sindbis virus vectors, Togaviruses, Semliki Forest virus (ATCC VR-67; ATCC VR-1247),
Middleberg virus (ATCC VR-370), Ross River virus (ATCC VR-373; ATCC VR-1246), Venezuelan equine
encephalitis virus (ATCC VR-923; ATCC VR-1250; ATCC VR-1249; ATCC VR-532), and those described in
US patents 5,091,309, 5,217,879, and WO92/10578. More particularly, those alpha virus vectors described in
US Serial No. 08/405,627, filed March 15, 1995, WO94/21792, WO92/10578, WO95/07994, US 5,091,309 and
US 5,217,879 are employable. Such alpha viruses may be obtained from depositories or collections such as the
ATCC in Rockville, Maryland or isolated from known sources using commonly available techniques.
Preferably, alphavirus vectors with reduced cytotoxicity are used (see USSN 08/679640).

DNA vector systems such as eukaryotic layered expression systems are also useful for expressing the nucleic
acids of the invention. See WO95/07994 for a detailed description of eukaryotic layered expression systems.
Preferably, the eukaryotic layered expression systems of the invention are derived from alphavirus vectors and
most preferably from Sindbis viral vectors.

Other viral vectors suitable for use in the present invention include those derived from poliovirus, for example
ATCC VR-58 and those described in Evans, Nature 339 (1989) 385 and Sabin (1973) J. Biol Standardization
1:115; rhinovirus, for example ATCC VR-1110 and those described in Arnold (1990) J Cell Biochem L401; pox
viruses such as canary pox virus or vaccinia virus, for example ATCC VR-111 and ATCC VR-2010 and those
described in Fisher-Hoch (1989) Proc Natl Acad Sci 86:317; Flexner (1989) Ann NY Acad Sci 569:86, Flexner
(1990) Vaccine 8:17; in US 4,603,112 and US 4,769,330 and WO89/01973; SV40 virus, for example ATCC
VR-305 and those described in Mulligan (1979) Nature 277:108 and Madzak (1992) J Gen Virol 73:1533;
influenza virus, for example ATCC VR-797 and recombinant influenza viruses made employing reverse genetics
techniques as described in US 5,166,057 and in Enami (1990) Proc Natl Acad Sci 87:3802-3805; Enami &
Palese (1991) J Virol 65:2711-2713 and Luytjes (1989) Cell 59:110, (see also McMichael (1983) NEJ Med
309:13, and Yap (1978) Nature 273:238 and Nature (1979) 277:108); human immunodeficiency virus as
described in EP-0386882 and in Buchschacher (1992) J. Virol. 66:2731; measles virus, for example ATCC
VR-67 and VR-1247 and those described in EP-0440219; Aura virus, for example ATCC VR-368; Bebaru virus,
for example ATCC VR-600 and ATCC VR-1240; Cabassou virus, for example ATCC VR-922; Chikungunya
virus, for example ATCC VR-64 and ATCC VR-1241; Fort Morgan Virus, for example ATCC VR-924; Getah
virus, for example ATCC VR-369 and ATCC VR-1243; Kyzylagach virus, for example ATCC VR-927; Mayaro
virus, for example ATCC VR-66; Mucambo virus, for example ATCC VR-580 and ATCC VR-1244; Ndumu
virus, for example ATCC VR-371; Pixuna virus, for example ATCC VR-372 and ATCC VR-1245; Tonate
virus, for example ATCC VR-925; Triniti virus, for example ATCC VR-469; Una virus, for example ATCC
VR-374; Whataroa virus, for example ATCC VR-926; Y-62-33 virus, for example ATCC VR-375; O'Nyong
virus, Eastern encephalitis virus, for example ATCC VR-65 and ATCC VR-1242; Western encephalitis virus,
for example ATCC VR-70, ATCC VR-1251, ATCC VR-622 and ATCC VR-1252; and coronavirus, for
example ATCC VR-740 and those described in Hamre (1966) Proc Soc Exp Biol Med 121:190.

Delivery of the compositions of this invention into cells is not limited to the above mentioned viral vectors.
Other delivery methods and media may be employed such as, for example, nucleic acid expression vectors,
polycationic condensed DNA linked or unlinked to killed adenovirus alone, for example see US Serial No.
08/366,787, filed December 30, 1994 and Curiel (1992) Hum Gene Ther 3:147-154, ligand linked DNA, for
example see Wu (1989) J Biol Chem 264:16985-16987, eucaryotic cell delivery vehicles cells, for example see
US Serial No. 08/240,030, filed May 9, 1994, and US Serial No. 08/404,796, deposition of photopolymerized
hydrogel materials, hand-held gene transfer particle gun, as described in US Patent 5,149,655, ionizing radiation
as described in US 5,206,152 and in WO92/11033, nucleic charge neutralization or fusion with cell membranes.
Additional approaches are described in Philip (1994) Mol Cell Biol 14:2411-2418 and in Woffendin (1994) Proc
Natl Acad Sci 91:1581-1585.

Particle mediated gene transfer may be employed, for example see US Serial No. 60/023,867. Briefly, the 
sequence can be inserted into conventional vectors that contain conventional control sequences for high level 
expression, and then incubated with synthetic gene transfer molecules such as polymeric DNA-binding cations 
like polylysine, protamine, and albumin, linked to cell targeting ligands such as asialoorosomucoid, as described 
in Wu & Wu (1987) J. Biol Chem. 262:4429-4432, insulin as described in Hucked (1990) Biochem Pharmacol
40:253-263, galactose as described in Plank (1992) Bioconjugate Chem 3:533-539, lactose or transferrin. 
Naked DNA may also be employed. Exemplary naked DNA introduction methods are described in WO90/11092
and US 5,580,859. Uptake efficiency may be improved using biodegradable latex beads. DNA coated latex
beads are efficiently transported into cells after endocytosis initiation by the beads. The method may be
improved further by treatment of the beads to increase hydrophobicity and thereby facilitate disruption of the
endosome and release of the DNA into the cytoplasm.

Liposomes that can act as gene delivery vehicles are described in US 5,422,120, WO95/13796, WO94/23697,
WO91/14445 and EP-524,968. As described in USSN 60/023,867, on non-viral delivery, the nucleic acid
sequences encoding a polypeptide can be inserted into conventional vectors that contain conventional control
sequences for high level expression, and then be incubated with synthetic gene transfer molecules such as
polymeric DNA-binding cations like polylysine, protamine, and albumin, linked to cell targeting ligands such as
asialoorosomucoid, insulin, galactose, lactose, or transferrin. Other delivery systems include the use of
liposomes to encapsulate DNA comprising the gene under the control of a variety of tissue-specific or
ubiquitously-active promoters. Further non-viral delivery suitable for use includes mechanical delivery systems
such as the approach described in Woffendin et al. (1994) Proc. Natl. Acad. Sci. USA 91(24):11581-11585.

Moreover, the coding sequence and the product of expression of such can be delivered through deposition of
photopolymerized hydrogel materials. Other conventional methods for gene delivery that can be used for
delivery of the coding sequence include, for example, use of hand-held gene transfer particle gun, as described in
US 5,149,655; use of ionizing radiation for activating transferred gene, as described in US 5,206,152 and
WO92/11033.

Exemplary liposome and polycationic gene delivery vehicles are those described in US 5,422,120 and
4,762,915; in WO95/13796; WO94/23697; and WO91/14445; in EP-0524968; and in Stryer, Biochemistry,
pages 236-240 (1975) W.H. Freeman, San Francisco; Szoka (1980) Biochem Biophys Acta 600:1; Bayer (1979)
Biochem Biophys Acta 550:464; Rivnay (1987) Meth Enzymol 149:119; Wang (1987) Proc Natl Acad Sci
84:7851; Plant (1989) Anal Biochem 176:420.

A polynucleotide composition can comprise a therapeutically effective amount of a gene therapy vehicle, as the
term is defined above. For purposes of the present invention, an effective dose will be from about 0.01 mg/kg to
50 mg/kg or 0.05 mg/kg to about 10 mg/kg of the DNA constructs in the individual to which it is administered.

Delivery Methods 

Once formulated, the polynucleotide compositions of the invention can be administered (1) directly to the
subject; (2) delivered ex vivo, to cells derived from the subject; or (3) in vitro for recombinant protein
expression. The subjects to be treated can be mammals or birds. Also, human subjects can be treated.

Direct delivery of the compositions will generally be accomplished by injection, either subcutaneously, 
intraperitoneally, intravenously or intramuscularly or delivered to the interstitial space of a tissue. The 
compositions can also be administered into a lesion. Other modes of administration include oral and pulmonary 
administration, suppositories, and transdermal or transcutaneous applications (e.g. see WO98/20734), needles,
and gene guns or hyposprays. Dosage treatment may be a single dose schedule or a multiple dose schedule. 

Methods for the ex vivo delivery and reimplantation of transformed cells into a subject are known in the art and
described in e.g. WO93/14778. Examples of cells useful in ex vivo applications include, for example, stem cells,
particularly hematopoietic, lymph cells, macrophages, dendritic cells, or tumor cells.

Generally, delivery of nucleic acids for both ex vivo and in vitro applications can be accomplished by the
following procedures, for example, dextran-mediated transfection, calcium phosphate precipitation, polybrene
mediated transfection, protoplast fusion, electroporation, encapsulation of the polynucleotide(s) in liposomes,
and direct microinjection of the DNA into nuclei, all well known in the art.

Polynucleotide and polypeptide pharmaceutical compositions

In addition to the pharmaceutically acceptable carriers and salts described above, the following additional agents
can be used with polynucleotide and/or polypeptide compositions. 

A. Polypeptides 

One example is polypeptides, which include, without limitation: asialoorosomucoid (ASOR); transferrin;
asialoglycoproteins; antibodies; antibody fragments; ferritin; interleukins; interferons; granulocyte-macrophage
colony stimulating factor (GM-CSF); granulocyte colony stimulating factor (G-CSF); macrophage colony
stimulating factor (M-CSF); stem cell factor; and erythropoietin. Viral antigens, such as envelope proteins, can
also be used, as can proteins from other invasive organisms, such as the 17 amino acid peptide from the
circumsporozoite protein of Plasmodium falciparum known as RII.

B. Hormones, Vitamins, etc.

Other groups that can be included are, for example: hormones, steroids, androgens, estrogens, thyroid hormone,
or vitamins, folic acid. 

C. Polyalkylenes, Polysaccharides, etc.

Also, polyalkylene glycol can be included with the desired polynucleotides/polypeptides. In a preferred
embodiment, the polyalkylene glycol is polyethylene glycol. In addition, mono-, di-, or polysaccharides can be
included. In a preferred embodiment of this aspect, the polysaccharide is dextran or DEAE-dextran. Also,
chitosan and poly(lactide-co-glycolide) can be included.

D. Lipids, and Liposomes 

The desired polynucleotide/polypeptide can also be encapsulated in lipids or packaged in liposomes prior to 
delivery to the subject or to cells derived therefrom.

Lipid encapsulation is generally accomplished using liposomes which are able to stably bind or entrap and retain
nucleic acid. The ratio of condensed polynucleotide to lipid preparation can vary but will generally be around
1:1 (mg DNA:micromoles lipid), or more of lipid. For a review of the use of liposomes as carriers for delivery of
nucleic acids, see Hug and Sleight (1991) Biochim. Biophys. Acta 1097:1-17; Straubinger (1983) Meth.
Enzymol. 101:512-527.

Liposomal preparations for use in the present invention include cationic (positively charged), anionic (negatively
charged) and neutral preparations. Cationic liposomes have been shown to mediate intracellular delivery of
plasmid DNA (Felgner (1987) Proc. Natl Acad. Sci. USA 84:7413-7416); mRNA (Malone (1989) Proc. Natl
Acad. Sci. USA 86:6077-6081); and purified transcription factors (Debs (1990) J. Biol Chem.
265:10189-10192), in functional form.

Cationic liposomes are readily available. For example, N-[1-(2,3-dioleyloxy)propyl]-N,N,N-triethylammonium
(DOTMA) liposomes are available under the trademark Lipofectin, from GIBCO BRL, Grand Island, NY. (See,
also, Felgner supra). Other commercially available liposomes include transfectace (DDAB/DOPE) and
DOTAP/DOPE (Boehringer). Other cationic liposomes can be prepared from readily available materials using
techniques well known in the art. See, e.g. Szoka (1978) Proc. Natl Acad. Sci. USA 75:4194-4198;
WO90/11092 for a description of the synthesis of DOTAP (1,2-bis(oleoyloxy)-3-(trimethylammonio)propane)
liposomes.

Similarly, anionic and neutral liposomes are readily available, such as from Avanti Polar Lipids (Birmingham,
AL), or can be easily prepared using readily available materials. Such materials include phosphatidyl choline,
cholesterol, phosphatidyl ethanolamine, dioleoylphosphatidyl choline (DOPC), dioleoylphosphatidyl glycerol
(DOPG), dioleoylphosphatidyl ethanolamine (DOPE), among others. These materials can also be mixed with the
DOTMA and DOTAP starting materials in appropriate ratios. Methods for making liposomes using these
materials are well known in the art.

The liposomes can comprise multilamellar vesicles (MLVs), small unilamellar vesicles (SUVs), or large
unilamellar vesicles (LUVs). The various liposome-nucleic acid complexes are prepared using methods known
in the art. See e.g. Straubinger (1983) Meth. Enzymol. 101:512-527; Szoka (1978) Proc. Natl Acad. Sci. USA
75:4194-4198; Papahadjopoulos (1975) Biochim. Biophys. Acta 394:483; Wilson (1979) Cell 17:77; Deamer &
Bangham (1976) Biochim. Biophys. Acta 443:629; Ostro (1977) Biochem. Biophys. Res. Commun. 76:836;
Fraley (1979) Proc. Natl Acad. Sci. USA 76:3348; Enoch & Strittmatter (1979) Proc. Natl Acad. Sci. USA
76:145; Fraley (1980) J. Biol. Chem. 255:10431; Szoka & Papahadjopoulos (1978) Proc. Natl Acad. Sci.
USA 75:145; and Schaefer-Ridder (1982) Science 215:166.

E. Lipoproteins

In addition, lipoproteins can be included with the polynucleotide/polypeptide to be delivered. Examples of
lipoproteins to be utilized include: chylomicrons, HDL, IDL, LDL, and VLDL. Mutants, fragments, or fusions
of these proteins can also be used. Also, modifications of naturally occurring lipoproteins can be used, such as
acetylated LDL. These lipoproteins can target the delivery of polynucleotides to cells expressing lipoprotein
receptors. Preferably, if lipoproteins are included with the polynucleotide to be delivered, no other targeting
ligand is included in the composition.

Naturally occurring lipoproteins comprise a lipid and a protein portion. The protein portion is known as
apoprotein. At present, apoproteins A, B, C, D, and E have been isolated and identified. At least two of
these contain several proteins, designated by Roman numerals: AI, AII, AIV; CI, CII, CIII.

A lipoprotein can comprise more than one apoprotein. For example, naturally occurring chylomicrons comprise
apoproteins A, B, C, and E; over time these lipoproteins lose A and acquire C and E apoproteins. VLDL comprises
A, B, C, and E apoproteins; LDL comprises apoprotein B; and HDL comprises apoproteins A, C, and E.

The amino acid sequences of these apoproteins are known and are described in, for example, Breslow (1985) Annu Rev.
Biochem 54:699; Law (1986) Adv. Exp Med. Biol. 151:162; Chen (1986) J Biol Chem 261:12918; Kane (1980)
Proc Natl Acad Sci USA 77:2465; and Utermann (1984) Hum Genet 65:232.

Lipoproteins contain a variety of lipids including triglycerides, cholesterol (free and esters), and phospholipids.
The composition of the lipids varies in naturally occurring lipoproteins. For example, chylomicrons comprise
mainly triglycerides. A more detailed description of the lipid content of naturally occurring lipoproteins can be
found, for example, in Meth. Enzymol. 128 (1986). The composition of the lipids is chosen to aid in
conformation of the apoprotein for receptor binding activity. The composition of lipids can also be chosen to
facilitate hydrophobic interaction and association with the polynucleotide binding molecule.

Naturally occurring lipoproteins can be isolated from serum by ultracentrifugation, for instance. Such methods
are described in Meth. Enzymol. (supra); Pitas (1980) J. Biochem. 255:5454-5460 and Mahey (1979) J. Clin.
Invest 64:743-750. Lipoproteins can also be produced by in vitro or recombinant methods by expression of the
apoprotein genes in a desired host cell. See, for example, Atkinson (1986) Annu Rev Biophys Chem 15:403 and
Radding (1958) Biochim Biophys Acta 30:443. Lipoproteins can also be purchased from commercial suppliers,
such as Biomedical Technologies, Inc., Stoughton, Massachusetts, USA. Further description of lipoproteins can
be found in Zuckermann et al PCT/US97/14465.

F. Polycationic Agents

Polycationic agents can be included, with or without lipoprotein, in a composition with the desired 
polynucleotide/polypeptide to be delivered. 

Polycationic agents typically exhibit a net positive charge at physiologically relevant pH and are capable of
neutralizing the electrical charge of nucleic acids to facilitate delivery to a desired location. These agents have
in vitro, ex vivo, and in vivo applications. Polycationic agents can be used to deliver nucleic acids to a
living subject intramuscularly, subcutaneously, etc.

The following are examples of useful polypeptides as polycationic agents: polylysine, polyarginine,
polyornithine, and protamine. Other examples include histones, protamines, human serum albumin, DNA
binding proteins, non-histone chromosomal proteins, and coat proteins from DNA viruses, such as ΦX174.
Transcriptional factors also contain domains that bind DNA and therefore may be useful as nucleic acid
condensing agents. Briefly, transcriptional factors such as C/CEBP, c-jun, c-fos, AP-1, AP-2, AP-3, CPF, Prot-1,
Sp-1, Oct-1, Oct-2, CREP, and TFIID contain basic domains that bind DNA sequences.

Organic polycationic agents include: spermine, spermidine, and putrescine.

The dimensions and the physical properties of a polycationic agent can be extrapolated from the list above, to
construct other polypeptide polycationic agents or to produce synthetic polycationic agents.

Synthetic polycationic agents which are useful include, for example, DEAE-dextran and polybrene. Lipofectin™
and LipofectAMINE™ are monomers that form polycationic complexes when combined with
polynucleotides/polypeptides.

Nucleic Acid Hybridisation

"Hybridization" refers to the association of two nucleic acid sequences to one another by hydrogen bonding.
Typically, one sequence will be fixed to a solid support and the other will be free in solution. Then, the two
sequences will be placed in contact with one another under conditions that favor hydrogen bonding. Factors that
affect this bonding include: the type and volume of solvent; reaction temperature; time of hybridization;
agitation; agents to block the non-specific attachment of the liquid phase sequence to the solid support
(Denhardt's reagent or BLOTTO); concentration of the sequences; use of compounds to increase the rate of
association of sequences (dextran sulfate or polyethylene glycol); and the stringency of the washing conditions
following hybridization. See Sambrook et al [supra] vol.2, chapt.9, pp.9.47 to 9.57.






"Stringency" refers to conditions in a hybridization reaction that favor association of very similar sequences over 
sequences that differ. For example, the combination of temperature and salt concentration should be chosen that 
is approximately 12° to 20°C below the calculated Tm of the hybrid under study. The temperature and salt
conditions can often be determined empirically in preliminary experiments in which samples of genomic DNA
immobilized on filters are hybridized to the sequence of interest and then washed under conditions of different
stringencies. See Sambrook et al at page 9.50. 

Variables to consider when performing, for example, a Southern blot are (1) the complexity of the DNA being
blotted and (2) the homology between the probe and the sequences being detected. The total amount of the
fragment(s) to be studied can vary by a magnitude of 10, from 0.1 to 1 µg for a plasmid or phage digest to 10⁻⁹ to
10⁻⁸ g for a single copy gene in a highly complex eukaryotic genome. For lower complexity polynucleotides,
substantially shorter blotting, hybridization, and exposure times, a smaller amount of starting polynucleotides,
and lower specific activity of probes can be used. For example, a single-copy yeast gene can be detected with an
exposure time of only 1 hour starting with 1 µg of yeast DNA, blotting for two hours, and hybridizing for 4-8
hours with a probe of 10⁸ cpm/µg. For a single-copy mammalian gene a conservative approach would start with
10 µg of DNA, blot overnight, and hybridize overnight in the presence of 10% dextran sulfate using a probe of
greater than 10⁸ cpm/µg, resulting in an exposure time of ~24 hours.

Several factors can affect the melting temperature (Tm) of a DNA-DNA hybrid between the probe and the
fragment of interest, and consequently, the appropriate conditions for hybridization and washing. In many cases
the probe is not 100% homologous to the fragment. Other commonly encountered variables include the length
and total G+C content of the hybridizing sequences and the ionic strength and formamide content of the
hybridization buffer. The effects of all of these factors can be approximated by a single equation:

Tm = 81 + 16.6(log10 Ci) + 0.4[%(G + C)] - 0.6(%formamide) - 600/n - 1.5(%mismatch)

where Ci is the salt concentration (monovalent ions) and n is the length of the hybrid in base pairs (slightly
modified from Meinkoth & Wahl (1984) Anal. Biochem. 138:267-284).
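
As an illustration only, the equation can be evaluated with a short Python function. The coefficients are those given in the text; the function name, the argument names and the example values are assumptions made for this sketch.

```python
import math


def hybrid_tm(salt_molar, gc_percent, formamide_percent, length_bp, mismatch_percent=0.0):
    """Approximate Tm (degrees C) of a DNA-DNA hybrid, using the equation in the text.

    salt_molar        -- monovalent ion concentration Ci, in mol/l
    gc_percent        -- G+C content of the hybridizing sequence, in percent
    formamide_percent -- formamide content of the hybridization buffer, in percent
    length_bp         -- length n of the hybrid, in base pairs
    mismatch_percent  -- percentage of mismatched bases between probe and target
    """
    if salt_molar <= 0:
        raise ValueError("salt concentration must be positive")
    if length_bp <= 0:
        raise ValueError("hybrid length must be positive")
    return (
        81.0
        + 16.6 * math.log10(salt_molar)
        + 0.4 * gc_percent
        - 0.6 * formamide_percent
        - 600.0 / length_bp
        - 1.5 * mismatch_percent
    )


# Hypothetical example: 600 bp hybrid, 50% G+C, 1 M salt, 50% formamide.
perfect = hybrid_tm(1.0, 50.0, 50.0, 600)
mismatched = hybrid_tm(1.0, 50.0, 50.0, 600, mismatch_percent=10.0)
print(f"perfect match: {perfect:.1f} C; 10% mismatch: {mismatched:.1f} C")
```

In this form each percent of mismatch lowers the estimate by 1.5 °C, which is why less homologous probes call for lower hybridization temperatures.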

In designing a hybridization experiment, some factors affecting nucleic acid hybridization can be conveniently
altered. The temperature of the hybridization and washes and the salt concentration during the washes are the
simplest to adjust. As the temperature of the hybridization increases (i.e. stringency), it becomes less likely for
hybridization to occur between strands that are nonhomologous, and as a result, background decreases. If the
radiolabeled probe is not completely homologous with the immobilized fragment (as is frequently the case in
gene family and interspecies hybridization experiments), the hybridization temperature must be reduced, and
background will increase. The temperature of the washes affects the intensity of the hybridizing band and the
degree of background in a similar manner. The stringency of the washes is also increased with decreasing salt
concentrations.

In general, convenient hybridization temperatures in the presence of 50% formamide are 42°C for a probe which
is 95% to 100% homologous to the target fragment, 37°C for 90% to 95% homology, and 32°C for 85% to 90%
homology. For lower homologies, formamide content should be lowered and temperature adjusted accordingly,
using the equation above. If the homology between the probe and the target fragment is not known, the
simplest approach is to start with both hybridization and wash conditions which are nonstringent. If non-specific
bands or high background are observed after autoradiography, the filter can be washed at high stringency and
reexposed. If the time required for exposure makes this approach impractical, several hybridization and/or
washing stringencies should be tested in parallel.

Nucleic Acid Probe Assays

Methods such as PCR, branched DNA probe assays, or blotting techniques utilizing nucleic acid probes 
according to the invention can determine the presence of cDNA or mRNA. A probe is said to "hybridize" with a
sequence of the invention if it can form a duplex or double stranded complex, which is stable enough to be 
detected. 

The nucleic acid probes will hybridize to the Chlamydial nucleotide sequences of the invention (including both 
sense and antisense strands). Though many different nucleotide sequences will encode the amino acid sequence, 
the native Chlamydial sequence is preferred because it is the actual sequence present in cells. mRNA represents
a coding sequence and so a probe should be complementary to the coding sequence; single-stranded cDNA is 
complementary to mRNA, and so a cDNA probe should be complementary to the non-coding sequence. 

The probe sequence need not be identical to the Chlamydial sequence (or its complement); some variation in
the sequence and length can lead to increased assay sensitivity if the nucleic acid probe can form a duplex with
target nucleotides, which can be detected. Also, the nucleic acid probe can include additional nucleotides to
stabilize the formed duplex. Additional Chlamydial sequence may also be helpful as a label to detect the formed
duplex. For example, a non-complementary nucleotide sequence may be attached to the 5' end of the probe, with
the remainder of the probe sequence being complementary to a Chlamydial sequence. Alternatively,
non-complementary bases or longer sequences can be interspersed into the probe, provided that the probe
sequence has sufficient complementarity with a Chlamydial sequence in order to hybridize therewith and
thereby form a duplex which can be detected.

The exact length and sequence of the probe will depend on the hybridization conditions, such as temperature, 
salt condition and the like. For example, for diagnostic applications, depending on the complexity of the analyte 
sequence, the nucleic acid probe typically contains at least 10-20 nucleotides, preferably 15-25, and more 
preferably >30 nucleotides, although it may be shorter than this. Short primers generally require cooler
temperatures to form sufficiently stable hybrid complexes with the template. 

Probes may be produced by synthetic procedures, such as the triester method of Matteucci et al [J. Am. Chem.
Soc. (1981) 103:3185], or according to Urdea et al [Proc. Natl Acad. Sci. USA (1983) 80:7461], or using
commercially available automated oligonucleotide synthesizers.

The chemical nature of the probe can be selected according to preference. For certain applications, DNA or
RNA are appropriate. For other applications, modifications may be incorporated e.g. backbone modifications,
such as phosphorothioates or methylphosphonates, can be used to increase in vivo half-life, alter RNA affinity,
increase nuclease resistance etc. [e.g. see Agrawal & Iyer (1995) Curr Opin Biotechnol 6:12-19; Agrawal (1996)
TIBTECH 14:376-387]; analogues such as peptide nucleic acids may also be used [e.g. see Corey (1997)
TIBTECH 15:224-229; Buchardt et al (1993) TIBTECH 11:384-386].

Alternatively, the polymerase chain reaction (PCR) is another well-known means for detecting small amounts of 
target nucleic acids. The assay is described in: Mullis et al [Meth. Enzymol. (1987) 155: 335-350]; US patents 
4,683,195 & 4,683,202. Two 'primers' hybridize with the target nucleic acids and are used to prime the reaction. 
The primers can comprise sequence that does not hybridize to the sequence of the amplification target (or its
complement) to aid with duplex stability or, for example, to incorporate a convenient restriction site. Typically,
such sequence will flank the desired Chlamydial sequence. 

A thermostable polymerase creates copies of target nucleic acids from the primers using the original target 
nucleic acids as a template. After a threshold amount of target nucleic acids are generated by the polymerase, 
they can be detected by more traditional methods, such as Southern blots. When using the Southern blot method,
the labelled probe will hybridize to the Chlamydial sequence (or its complement). 

Also, mRNA or cDNA can be detected by traditional blotting techniques described in Sambrook et al [supra]. 
mRNA, or cDNA generated from mRNA using a polymerase enzyme, can be purified and separated using gel 
electrophoresis. The nucleic acids on the gel are then blotted onto a solid support, such as nitrocellulose. The 
solid support is exposed to a labelled probe and then washed to remove any unhybridized probe. Next, the
duplexes containing the labeled probe are detected. Typically, the probe is labelled with a radioactive moiety, 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figures 1-189 show data pertaining to examples 1-189. 

Figure 190 shows a representative 2D gel of proteins in elementary bodies.

Figure 191 shows an alignment of sequences in five (six) proteins of the invention.

EXAMPLES 

The examples indicate C.pneumoniae proteins, together with evidence to support the view that the 
proteins are useful antigens for vaccine production and development or for diagnostic purposes. This 
evidence takes the form of: 

• Computer prediction based on sequence information from CWL029 strain (e.g. using the
PSORT algorithm available from www.psort.nibb.ac.jp).

• Data on recombinant expression and purification of the proteins cloned from IOL-207 strain.

• Western blots to demonstrate immunoreactivity in serum (typically a blot of an EB extract of 
C.pneumoniae strain FB/96 stained with mouse antiserum against the recombinant protein). 

• FACS analysis of C.pneumoniae bacteria or purified EBs to confirm accessibility of the 
antigen to the immune system (see also table III).

• An indication if the protein was identified by MALDI-TOF from a 2D gel electrophoresis 
map of proteins from purified elementary bodies from strain FB/96. This confirms that the 
protein is expressed in vivo (see also table V). 

Various tests can be used to assess the in vivo immunogenicity of the proteins identified in the
examples. For example, the proteins can be expressed recombinantly and used to screen patient sera
by immunoblot. A positive reaction between the protein and patient serum indicates that the patient
has previously mounted an immune response to the protein in question i.e. the protein is an
immunogen. This method can also be used to identify immunodominant proteins.

The recombinant protein can also be conveniently used to prepare antibodies e.g. in a mouse. These
can be used for direct confirmation that a protein is located on the cell-surface. Labelled antibody
(e.g. fluorescent labelling for FACS) can be incubated with intact bacteria and the presence of label
on the bacterial surface confirms the location of the protein.

In particular, the following methods (A) to (O) were used to express, purify and biochemically
characterise the proteins of the invention:

CLONING OF CPN ORFs FOR EXPRESSION IN E. COLI

ORFs of Chlamydia pneumoniae (Cpn) were cloned in such a way as to potentially obtain three
different kinds of proteins:

a) proteins having a hexa-histidine tag at the C-terminus (cpn-His)

b) proteins having a GST fusion partner at the N-terminus (Gst-cpn)

c) proteins having both a hexa-histidine tag at the C-terminus and GST at the N-terminus
(GST/His fusion; NH2-GST-cpn-(His)6-COOH)

The type a) proteins were obtained upon cloning in pET21b+ (Novagen). The type b) and c)
proteins were obtained upon cloning in modified pGEX-KG vectors [Guan & Dixon (1991) Anal
Biochem. 192:262]. Specifically, pGEX-KG was modified to obtain pGEX-NN, and pGEX-NN was
then modified to obtain pGEX-NNH. The Gst-cpn and Gst-cpn-His proteins were obtained in
pGEX-NN and pGEX-NNH respectively.

The modified versions of pGEX-KG vector were made with the aim of allowing the cloning of 
single amplification products in all three vectors after only one double restriction enzyme digestion
and to minimise the presence of extraneous amino acids in the final recombinant proteins. 

(A) Construction of pGEX-NN and pGEX-NNH expression vectors 

Two pairs of complementary oligodeoxyribonucleotides were synthesised using the DNA
synthesiser ABI394 (Perkin Elmer) and the reagents from Cruachem (Glasgow, Scotland). Equimolar
amounts of the oligo pairs (50 ng each oligo) were annealed in T4 DNA ligase buffer (New England
Biolabs) for 10 min in a final volume of 50 µl and then were left to cool slowly at room temperature.
With the described procedure the following DNA linkers were obtained:

gexNN linker:

     NdeI  NheI XmaI  EcoRI   NcoI       SalI     XhoI       SacI      NotI
GATCCCATATGGCTAGCCCGGGGAATTCGTCCATGGAGTGAGTCGACTGACTCGAGTGATCGAGCTCCTGAGCGGCCGCATGAA
    GGTATACCGATCGGGCCCCTTAAGCAGGTACCTCACTCAGCTGACTGAGCTCACTAGCTCGAGGACTCGCCGGCGTACTTTCGA

gexNNH linker:

     HindIII NotI   XhoI  --Hexa-Histidine--
TCGACAAGCTTGCGGCCGCACTCGAGCATCACCATCACCATCACTGAT
    GTTCGAACGCCGGCGTGAGCTCGTAGTGGTAGTGGTAGTGACTATCGA
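
Because the two strands of each linker are printed one above the other, their base pairing can be checked mechanically. The following Python sketch does this for the gexNN linker. It assumes (an inference from the BamHI and HindIII sites used for cloning, not something stated in the text) that the top strand carries a 4-nucleotide unpaired overhang at its left end and the bottom strand a 4-nucleotide unpaired overhang at its right end. The function and variable names are hypothetical.

```python
# Watson-Crick complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# gexNN linker strands as printed in the text: top strand read 5'->3',
# bottom strand read 3'->5' (written underneath the top strand).
GEXNN_TOP = "GATCCCATATGGCTAGCCCGGGGAATTCGTCCATGGAGTGAGTCGACTGACTCGAGTGATCGAGCTCCTGAGCGGCCGCATGAA"
GEXNN_BOTTOM = "GGTATACCGATCGGGCCCCTTAAGCAGGTACCTCACTCAGCTGACTGAGCTCACTAGCTCGAGGACTCGCCGGCGTACTTTCGA"


def strands_pair(top, bottom, top_overhang=4, bottom_overhang=4):
    """Return True if the two strands base-pair outside their overhangs.

    top             -- upper strand, read 5'->3'
    bottom          -- lower strand, read 3'->5'
    top_overhang    -- unpaired bases at the left end of the top strand (assumed)
    bottom_overhang -- unpaired bases at the right end of the bottom strand (assumed)
    """
    paired_top = top[top_overhang:]
    paired_bottom = bottom[: len(bottom) - bottom_overhang]
    return paired_top.translate(COMPLEMENT) == paired_bottom


print(strands_pair(GEXNN_TOP, GEXNN_BOTTOM))
```

The same check can be applied to the gexNNH linker, where it is a convenient way to spot transcription errors in either strand.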

The plasmid pGEX-KG was digested with BamHI and HindIII and 100 ng were ligated overnight at
16 °C to the linker gexNN with a molar ratio of 3:1 linker/plasmid using 200 units of T4 DNA ligase
(New England Biolabs). After transformation of the ligation product in E. coli DH5, a clone
containing the pGEX-NN plasmid, having the correct linker, was selected by means of restriction
enzyme analysis and DNA sequencing.

The new plasmid pGEX-NN was digested with SalI and HindIII and ligated to the linker gexNNH.
After transformation of the ligation product in E. coli DH5, a clone containing the pGEX-NNH
plasmid, having the correct linker, was selected by means of restriction enzyme analysis and DNA
sequencing.

(B) Chromosomal DNA preparation 

The chromosomal DNA of elementary bodies (EB) of C.pneumoniae strain IOL-207 was prepared by
adding 1.5 ml of lysis buffer (10 mM Tris-HCl, 150 mM NaCl, 2 mM EDTA, 0.6% SDS, 100 µg/ml
Proteinase K, pH 8) to 450 µl EB suspension (400,000/µl) and incubating overnight at 37 °C. After
sequential extraction with phenol, phenol-chloroform, and chloroform, the DNA was precipitated
with 0.3 M sodium acetate, pH 5.2 and 2 volumes of absolute ethanol. The DNA pellet was washed
with 70% ethanol. After solubilization with distilled water and treatment with 20 ng/ml RNAse A
for 1 hour at RT, the DNA was extracted again with phenol-chloroform, alcohol precipitated and
suspended with 300 µl 1 mM Tris-HCl pH 8.5. The DNA concentration was evaluated by measuring
OD260 of the sample.
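
The text does not say how the OD260 reading was converted into a concentration. A minimal sketch of the usual calculation is given below, assuming the common rule of thumb that an OD260 of 1.0 in a 1 cm cuvette corresponds to about 50 µg/ml of double-stranded DNA. That factor, the function names and the example reading are assumptions, not values taken from the text.

```python
# Assumed conversion factor (not stated in the text): OD260 of 1.0 at a
# 1 cm path length is taken as roughly 50 micrograms/ml double-stranded DNA.
DSDNA_UG_PER_ML_PER_OD = 50.0


def dna_concentration_ug_per_ml(od260, dilution_factor=1.0):
    """Concentration of the undiluted sample, in micrograms per ml."""
    if od260 < 0:
        raise ValueError("OD260 cannot be negative")
    if dilution_factor <= 0:
        raise ValueError("dilution factor must be positive")
    return od260 * DSDNA_UG_PER_ML_PER_OD * dilution_factor


def dna_yield_ug(od260, volume_ul, dilution_factor=1.0):
    """Total DNA in the preparation, in micrograms."""
    return dna_concentration_ug_per_ml(od260, dilution_factor) * volume_ul / 1000.0


# Hypothetical reading: OD260 of 0.25 measured on a 1:20 dilution of a
# 300 microlitre preparation (300 microlitres is the volume used in the text).
concentration = dna_concentration_ug_per_ml(0.25, dilution_factor=20)
total = dna_yield_ug(0.25, volume_ul=300, dilution_factor=20)
print(f"{concentration:.0f} ug/ml, {total:.0f} ug in total")
```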

(C) Oligonucleotide design 

Synthetic oligonucleotide primers were designed on the basis of the coding sequence of each ORF
using the sequence of C.pneumoniae strain CWL029. Any predicted signal peptides were omitted, by
deducing the 5' end amplification primer sequence immediately downstream from the predicted
leader sequence. For most ORFs, the 5' tail of the primers (table I) included only one restriction
enzyme recognition site (NdeI, NheI or SpeI, depending on the gene's own restriction pattern); the
3' primer tails (table I) included a XhoI, NotI or HindIII restriction site.



5' tails                            3' tails

NdeI   5' GTGCGTCATATG 3'           XhoI     5' GCGTCTCGAG 3'
NheI   5' GTGCGTGCTAGC 3'           NotI     5' ACTCGCTAGCGGCCGC 3'
SpeI   5' GTGCGTACTAGT 3'           HindIII  5' GCGTAAGCTT 3'

Table I. Oligonucleotide tails of the primers used to amplify Cpn genes.

As well as containing the restriction enzyme recognition sequences, the primers included nucleotides
which hybridized to the sequence to be amplified. The number of hybridizing nucleotides depended
on the melting temperature of the primers, which was determined as described [Breslauer et al
(1986) PNAS USA 83:3746-50]. The average melting temperature of the selected oligos was 50-55°C
for the hybridizing region alone and 65-75°C for the whole oligos. Table II shows the forward and
reverse primers used for each amplification.
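
The way a tail from Table I is combined with a hybridizing region can be sketched as follows. The tail sequences are those of Table I; the recognition sites are the standard ones for the named enzymes; the ORF fragments, the function names and the choice of hybridizing length are hypothetical and do not correspond to any primer in Table II.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Primer tails from Table I, written 5'->3'.
FIVE_PRIME_TAILS = {
    "NdeI": "GTGCGTCATATG",
    "NheI": "GTGCGTGCTAGC",
    "SpeI": "GTGCGTACTAGT",
}
THREE_PRIME_TAILS = {
    "XhoI": "GCGTCTCGAG",
    "NotI": "ACTCGCTAGCGGCCGC",
    "HindIII": "GCGTAAGCTT",
}
# Recognition sequences of the enzymes named in Table I.
RECOGNITION_SITES = {
    "NdeI": "CATATG",
    "NheI": "GCTAGC",
    "SpeI": "ACTAGT",
    "XhoI": "CTCGAG",
    "NotI": "GCGGCCGC",
    "HindIII": "AAGCTT",
}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence given 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def forward_primer(enzyme, orf_start):
    """5' tail followed by the first bases of the coding strand."""
    return FIVE_PRIME_TAILS[enzyme] + orf_start


def reverse_primer(enzyme, orf_end):
    """3' tail followed by the reverse complement of the last coding-strand bases."""
    return THREE_PRIME_TAILS[enzyme] + reverse_complement(orf_end)


# Hypothetical 15-base fragments from the start and end of an ORF.
print(forward_primer("NdeI", "ATGAAACGTTTAGCG"))
print(reverse_primer("XhoI", "GCTAAAGGTGAATCC"))
```

Every tail in Table I ends with its recognition site, preceded by a few extra bases; the sketch simply concatenates tail and hybridizing region and does not attempt the melting temperature calculation cited in the text.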




(D) Amplification 

The standard PCR protocol was as follows: 50 ng genomic DNA were used as template in the
presence of 0.2 µM each primer, 200 µM each dNTP, 1.5 mM MgCl2, 1x PCR buffer minus Mg
(Gibco-BRL), and 2 units of Taq DNA polymerase (Platinum Taq, Gibco-BRL) in a final volume of
100 µl. Each sample underwent a double-step amplification: the first 5 cycles were performed using
as the hybridizing temperature that of the oligos excluding the restriction enzyme tail, followed
by 25 cycles performed according to the hybridization temperature of the whole-length primers. The
standard cycles were as follows:

denaturation:    94 °C, 2 min

denaturation:    94 °C, 30 seconds
hybridization:   51 °C, 50 seconds                   } 5 cycles
elongation:      72 °C, 1 min or 2 min and 40 sec

denaturation:    94 °C, 30 seconds
hybridization:   70 °C, 50 seconds                   } 25 cycles
elongation:      72 °C, 1 min or 2 min and 40 sec

                 72 °C, 7 min
                 4 °C

The elongation time was 1 min for ORFs shorter than 2000 bp, and 2 min and 40 seconds for ORFs 
longer than 2000 bp. The amplifications were performed using a Gene Amp PCR system 9600 
(Perkin Elmer). 
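
The two-step cycling scheme and the elongation rule can be written down as a small data structure, which also gives the programmed run time. This is a sketch only: the class and function names are hypothetical, the default hybridization temperatures are the standard values quoted above, ramp times are ignored, and the final 4 °C hold is left out because it has no fixed duration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


def elongation_seconds(orf_bp):
    """1 min for ORFs shorter than 2000 bp, 2 min 40 s for longer ORFs.

    The text does not cover an ORF of exactly 2000 bp; this sketch
    treats it as a long ORF.
    """
    return 60 if orf_bp < 2000 else 160


def build_program(orf_bp, first_anneal_c=51, second_anneal_c=70):
    """Return the program as a list of (repeat_count, steps) blocks."""
    denature = Step("denaturation", 94, 30)
    extend = Step("elongation", 72, elongation_seconds(orf_bp))
    return [
        (1, [Step("initial denaturation", 94, 120)]),
        (5, [denature, Step("hybridization", first_anneal_c, 50), extend]),
        (25, [denature, Step("hybridization", second_anneal_c, 50), extend]),
        (1, [Step("final elongation", 72, 420)]),
    ]


def total_seconds(program):
    """Programmed hold time in seconds, excluding temperature ramps."""
    return sum(count * sum(step.seconds for step in steps) for count, steps in program)


for size in (1500, 2500):  # hypothetical ORF lengths in bp
    minutes = total_seconds(build_program(size)) / 60
    print(f"{size} bp ORF: {minutes:.0f} min of programmed time")
```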

To check the amplification results, 4 µl of each PCR product was loaded onto a 1-1.5% agarose gel and
the size of amplified fragments compared with DNA molecular weight standards (DNA markers III
or IX, Roche). The PCR products were loaded on agarose gel and after electrophoresis the right size
bands were excised from the gel. The DNA was purified from the agarose using the Gel Extraction
Kit (Qiagen) following the instructions of the manufacturer. The final elution volume of the DNA was
50 µl TE (10 mM Tris-HCl, 1 mM EDTA, pH 8). One µl of each purified DNA was loaded onto
agarose gel to evaluate the yield.

(E) Digestion of PCR fragments 

One to two µg of purified PCR product were double digested overnight at 37 °C with the appropriate
restriction enzymes (60 units of each enzyme) using the appropriate restriction buffer in 100 µl final
volume. The restriction enzymes and the digestion buffers were from New England Biolabs. After
purification of the digested DNA (PCR purification Kit, Qiagen) and elution with 30 µl TE, 1 µl was
subjected to agarose gel electrophoresis to evaluate the yield in comparison to titrated molecular
weight standards (DNA markers III or IX, Roche).

(F) Digestion of the cloning vectors (pET21b+, pGEX-NN, and pGEX-NNH)

10 µg of plasmid was double digested with 100 units of each restriction enzyme in 400 µl reaction
volume in the presence of appropriate buffer by overnight incubation at 37 °C. After electrophoresis
on a 1% agarose gel, the band corresponding to the digested vector was purified from the gel using
the Qiagen Qiaex II Gel Extraction Kit and the DNA was eluted with 50 µl TE. The DNA
concentration was evaluated by measuring OD260 of the sample.

(G) Cloning

75 ng of the appropriately digested and purified vectors and the digested and purified fragments
corresponding to each ORF, were ligated in final volumes of 10-20 µl with a molar ratio of 1:1
fragment/vector, using 400 units T4 DNA ligase (New England Biolabs) in the presence of the buffer
supplied by the manufacturer. The reactions were incubated overnight at 16 °C.
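
For a given mass of vector, the mass of fragment needed to reach a chosen molar ratio follows from the ratio of the two lengths. The sketch below illustrates the arithmetic; the function name and the vector and insert sizes in the example are hypothetical and are not taken from the text.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, insert_to_vector_ratio=1.0):
    """Mass of insert (ng) giving the requested insert:vector molar ratio.

    For double-stranded DNA the mass per mole is proportional to length,
    so moles are proportional to mass divided by length in base pairs.
    """
    if vector_ng <= 0 or vector_bp <= 0 or insert_bp <= 0:
        raise ValueError("masses and lengths must be positive")
    if insert_to_vector_ratio <= 0:
        raise ValueError("molar ratio must be positive")
    return vector_ng * (insert_bp / vector_bp) * insert_to_vector_ratio


# Hypothetical example: 75 ng of a 5400 bp vector and a 1200 bp fragment
# at the 1:1 fragment/vector molar ratio used in the text.
print(f"{insert_mass_ng(75, 5400, 1200):.1f} ng of fragment")
```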

Transformation in E. coli DH5 competent cells was performed as follows: the ligation reaction was
mixed with 200 µl of competent DH5 cells and incubated on ice for 30 min and then at 42 °C for 90
seconds. After cooling on ice, 0.8 ml LB was added and the cells were incubated for 45 min at 37 °C
under shaking. 100 and 900 µl of cell suspensions were plated on separate plates of agar LB 100
µg/ml Ampicillin and the plates were incubated overnight at 37 °C. The screening of the
transformants was done by growing randomly chosen clones in 6 ml LB 100 µg/ml Ampicillin, by
extracting the DNA using the Qiagen Qiaprep Spin Miniprep Kit following the manufacturer's
instructions, and by digesting 2 µl of plasmid minipreparation with the restriction enzymes specific
for the restriction cloning sites. After agarose gel electrophoresis of the digested plasmid
mini-preparations, positive clones were chosen on the basis of the correct size of the restriction fragments,
as evaluated by comparison with appropriate molecular weight markers (DNA markers III or IX,
Roche).

(H) Expression 

1 µl of each correct plasmid mini-preparation was transformed into 200 µl of a competent E. coli strain
suitable for expression of the recombinant protein. All pET21b+ recombinant plasmids were
transformed into BL21 DE3 (Novagen) E. coli cells, whilst all pGEX-NN and all pGEX-NNH
recombinant plasmids were transformed into BL21 cells (Novagen). After plating transformation
mixtures on LB/Amp agar plates and incubation overnight at 37 °C, single colonies were inoculated
in 3 ml LB 100 µg/ml Ampicillin and grown at 37 °C overnight. 70 µl of the overnight culture was
inoculated in 2 ml LB/Amp and grown at 37 °C until the OD600 of the pET clones reached 0.4-0.8,
or until the OD600 of the pGEX clones reached 0.8-1. Protein expression was then
induced by adding IPTG (isopropyl β-D-thiogalactopyranoside) to the mini-cultures. pET clones
were induced using 1 mM IPTG, whilst pGEX clones were induced using 0.2 mM IPTG. After 3
hours incubation at 37 °C the final OD600 was checked and the cultures were cooled on ice. After
centrifugation of 0.5 ml culture, the cell pellet was suspended in 50 µl of protein Loading Sample
Buffer (60 mM TRIS-HCl pH 6.8, 5% w/v SDS, 10% v/v glycerin, 0.1% w/v Bromophenol Blue,
100 mM DTT) and incubated at 100 °C for 5 min. A volume of boiled sample corresponding to 0.1
OD600 culture was analysed by SDS-PAGE and Coomassie Blue staining to verify the presence of
the induced protein band.
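The gel loading volume follows from the final OD600 of each mini-culture. The sketch below assumes that "0.1 OD600 culture" means the cells contained in 0.1 OD600 x ml of culture, and that the pellet from 0.5 ml is resuspended in 50 µl as described; the OD value and function name are hypothetical.

```python
# Sketch only: interprets "0.1 OD600 culture" as 0.1 OD600 x ml of cells (assumption).
def load_volume_ul(final_od600, target_od_units=0.1, culture_ml=0.5, buffer_ul=50.0):
    """Volume of boiled sample (µl) that contains the target amount of cells."""
    if final_od600 <= 0:
        raise ValueError("OD600 must be positive")
    od_units_in_pellet = final_od600 * culture_ml  # OD600 x ml pelleted
    return target_od_units / od_units_in_pellet * buffer_ul


# Hypothetical induced culture that reached OD600 = 2.0.
print(f"load {load_volume_ul(2.0):.1f} ul")
```

Under these assumptions the volume is simply 10 µl divided by the final OD600, so denser cultures need smaller loads and all lanes carry comparable amounts of cells.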

PURIFICATION OF THE RECOMBINANT PROTEINS 

Single colonies were inoculated in 25 ml LB 100 µg/ml Ampicillin and grown at 37 °C overnight.
The overnight culture was inoculated in 500 ml LB/Amp and grown under shaking at 25 °C until
OD600 reached 0.4-0.8 for the pET clones, or 0.8-1 for the pGEX clones. Protein
expression was then induced by adding IPTG to the cultures. pET clones were induced using 1 mM
IPTG, whilst pGEX clones were induced using 0.2 mM IPTG. After 4 hours incubation at 25 °C the
final OD600 was checked and the cultures were cooled on ice. After centrifugation at 6000 rpm (JA10
rotor, Beckman), the cell pellet was processed for purification or frozen at -20 °C.

(I) Procedure for the purification of soluble His-tagged proteins from E. coli

1. Transfer the pellets from -20°C to an ice bath and reconstitute with 10 ml 50 mM NaHPO4 buffer,
300 mM NaCl, pH 8.0, transfer to 40-50 ml centrifugation tubes and break the cells as per the
following outline:

2. Break the pellets in the French Press performing three passages with in-line washing.

3. Centrifuge at about 30,000-40,000 x g for 15-20 min. If possible use rotor JA 25.50 (21000 rpm, 15
min.) or JA-20 (18000 rpm, 15 min.).

4. Equilibrate the Poly-Prep columns with 1 ml Fast Flow Chelating Sepharose resin with 50 mM
phosphate buffer, 300 mM NaCl, pH 8.0.

5. Store the centrifugation pellet at -20°C, and load the supernatant in the columns.

6. Collect the flow through.

7. Wash the columns with 10 ml (2 ml + 2 ml + 4 ml) 50 mM phosphate buffer, 300 mM NaCl, pH
8.0.

8. Wash again with 10 ml 20 mM imidazole buffer, 50 mM phosphate, 300 mM NaCl, pH 8.0.

9. Elute the proteins bound to the columns with 4.5 ml (1.5 ml + 1.5 ml + 1.5 ml) 250 mM
imidazole buffer, 50 mM phosphate, 300 mM NaCl, pH 8.0 and collect the 3 corresponding
fractions of ~1.5 ml each. Add to each tube 15 µl DTT 200 mM (final concentration 2 mM).

10. Measure the protein concentration of the first two fractions with the Bradford method, collect a
10 µg aliquot of proteins from each sample and analyse by SDS-PAGE. (N.B.: should the sample
be too diluted, load 21 µl + 7 µl loading buffer).

11. Store the collected fractions at +4°C while waiting for the results of the SDS-PAGE analysis.

12. For immunisation prepare 4-5 aliquots of 100 µg each in 0.5 ml in 40% glycerol. The dilution
buffer is the above elution buffer, plus 2 mM DTT. Store the aliquots at -20°C until
immunisation.
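Two pieces of arithmetic in the steps above can be checked with a short sketch: the final DTT concentration after adding 15 µl of 200 mM stock to a ~1.5 ml fraction, and the volumes needed for one 100 µg aliquot in 0.5 ml at 40% glycerol. The protein concentration used in the example and the assumption of an undiluted (100%) glycerol stock are hypothetical; the function names are illustrative.

```python
# Sketch only: protein concentration and the 100% glycerol stock are assumptions.
def final_conc_mM(stock_mM, added_ul, sample_ul):
    """Concentration after adding a small volume of stock to a sample."""
    return stock_mM * added_ul / (sample_ul + added_ul)


def aliquot_volumes_ul(protein_mg_per_ml, protein_ug=100.0, total_ul=500.0,
                       glycerol_fraction=0.40):
    """Return (protein, glycerol, dilution buffer) volumes in µl for one aliquot."""
    protein_ul = protein_ug / protein_mg_per_ml  # mg/ml equals µg/µl
    glycerol_ul = total_ul * glycerol_fraction
    buffer_ul = total_ul - protein_ul - glycerol_ul
    if buffer_ul < 0:
        raise ValueError("fraction too dilute for a 0.5 ml aliquot")
    return protein_ul, glycerol_ul, buffer_ul


print(f"DTT: {final_conc_mM(200, 15, 1500):.2f} mM")
print("aliquot (ul):", aliquot_volumes_ul(1.0))  # hypothetical 1 mg/ml fraction
```

The DTT value comes out just under 2 mM, in line with the stated final concentration.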

(J) Purification of His-tagged proteins from inclusion bodies

Purifications were carried out essentially according to the following protocol:

1. Bacteria are collected from 500 ml cultures by centrifugation. If required store bacterial pellets at
-20°C. For extraction, resuspend each bacterial pellet in 10 ml 50 mM TRIS-HCl buffer, pH 8.5
on an ice bath.

2. Disrupt the resuspended bacteria with a French Press, performing two passages.

3. Centrifuge at 35000 x g for 15 min and collect the pellets. Use a Beckman rotor JA 25.50 (21000
rpm, 15 min.) or JA-20 (18000 rpm, 15 min.).

4. Dissolve the centrifugation pellets with 50 mM TRIS-HCl, 1 mM TCEP {Tris(2-carboxyethyl)-
phosphine hydrochloride, Pierce}, 6M guanidinium chloride, pH 8.5. Stir for ~10 min. with a
magnetic bar.

5. Centrifuge as described above, and collect the supernatant.

6. Prepare an adequate number of Poly-Prep (Bio-Rad) columns containing 1 ml of Fast Flow
Chelating Sepharose (Pharmacia) saturated with nickel according to the manufacturer's
recommendations. Wash the columns twice with 5 ml of H2O and equilibrate with 50 mM TRIS-HCl,
1 mM TCEP, 6M guanidinium chloride, pH 8.5.

7. Load the supernatants from step 5 onto the columns, and wash with 5 ml of 50 mM TRIS-HCl
buffer, 1 mM TCEP, 6M urea, pH 8.5.

8. Wash the columns with 10 ml of 20 mM imidazole, 50 mM TRIS-HCl, 6M urea, 1 mM TCEP,
pH 8.5. Collect and set aside the first 5 ml for possible further controls.

9. Elute the proteins bound to the columns with 4.5 ml of a buffer containing 250 mM imidazole, 50
mM TRIS-HCl, 6M urea, 1 mM TCEP, pH 8.5. Add the elution buffer in three 1.5 ml aliquots,
and collect the corresponding 3 fractions. Add to each fraction 15 µl DTT (final concentration 2
mM).

10. Measure eluted protein concentration with the Bradford method, and analyze aliquots of ca 10 µg
of protein by SDS-PAGE.

11. Store proteins at -20°C in 40% (v/v) glycerol, 50 mM TRIS-HCl, 2M urea, 0.5 M arginine, 2 mM
DTT, 0.3 mM TCEP, 83.3 mM imidazole, pH 8.5.
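The storage concentrations in step 11 are consistent with roughly a three-fold dilution of the elution buffer of step 9 (250 mM imidazole to 83.3 mM, 6M urea to 2M). The protocol does not state this dilution explicitly, so the sketch below is an inference rather than the authors' stated method; it only checks the ratios and derives the diluent volume that a three-fold dilution would imply.

```python
# Sketch only: the three-fold dilution is inferred from the listed concentrations.
ELUTION = {"imidazole_mM": 250.0, "urea_M": 6.0, "TCEP_mM": 1.0}   # step 9
STORAGE = {"imidazole_mM": 83.3, "urea_M": 2.0, "TCEP_mM": 0.3}    # step 11


def fold_dilution(before, after):
    """Fold change of each component between two buffers."""
    return {name: before[name] / after[name] for name in before}


def mix_volumes_ml(eluate_ml, fold=3.0):
    """Eluate and diluent volumes (ml) for a given fold dilution."""
    return eluate_ml, eluate_ml * (fold - 1.0)


for name, fold in fold_dilution(ELUTION, STORAGE).items():
    print(f"{name}: {fold:.2f}-fold")
print("1.5 ml fraction + diluent (ml):", mix_volumes_ml(1.5))
```

Under this reading, glycerol, arginine and DTT would be supplied by the diluent so that they reach the listed storage values after mixing.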
(K) Procedure for the purification of GST-fusion proteins from E. coli

1. Transfer the bacterial pellets from -20°C to an ice bath and resuspend with 7.5 ml PBS, pH 7.4
to which a mixture of protease inhibitors (COMPLETE™ - Boehringer Mannheim, 1 tablet per
25 ml of buffer) has been added. Transfer to 40-50 ml centrifugation tubes and sonicate
according to the following procedure:

a) Position the probe at about 0.5 cm from the bottom of the tube

b) Block the tube with the clamp

c) Dip the tube in an ice bath

d) Set the sonicator as follows: Timer Hold, Duty Cycle -> 55, Out. Control -> 6

e) Perform 5 cycles of 10 impulses at a time lapse of 1 minute (i.e. one cycle = 10 impulses
+ ~45" hold, repeated five times)

2. Centrifuge at about 30,000-40,000 x g for 15-20 min. E.g.: use rotor Beckman JA 25.50 at 21000
rpm, for 15 min.

3. Store the centrifugation pellets at -20°C, and load the supernatants on the chromatography
columns, as follows.

4. Equilibrate the Poly-Prep (Bio-Rad) columns with 0.5 ml (= 1 ml suspension) of Glutathione-
Sepharose 4B resin, wash with 2 ml (1 + 1) H2O, and then with 10 ml (2 + 4 + 4) PBS, pH 7.4.

5. Load the supernatants on the columns and discard the flow through.

6. Wash the columns with 10 ml (2 + 4 + 4) PBS, pH 7.4.

7. Elute the proteins bound to the columns with 4.5 ml of 50 mM TRIS buffer, 10 mM reduced
glutathione, pH 8.0, adding 1.5 ml + 1.5 ml + 1.5 ml and collecting the respective 3 fractions of
~1.5 ml each.

8. Measure the protein concentration of the first two fractions with the Bradford method, and analyse a
10 µg aliquot of proteins from each sample by SDS-PAGE. (N.B.: if the sample is too diluted
load 21 µl + 7 µl loading buffer).

9. Store the collected fractions at +4°C while waiting for the results of the SDS-PAGE analysis.

10. For each protein intended for immunisation prepare 4-5 aliquots of 100 µg each in 0.5 ml of
40% glycerol. The dilution buffer is 50 mM TRIS-HCl, 2 mM DTT, pH 8.0. Store the aliquots at
-20°C until immunisation.

SEROLOGY 

(L) Protocol of immunization 

1. Groups of four CD1 female mice aged between 6 and 7 weeks were immunized with 20 µg of
recombinant protein resuspended in 100 µl.

2. The four mice in each group received 3 doses at 14-day intervals.

3. Immunization was performed through intra-peritoneal injection of the protein with an equal
volume of Complete Freund's Adjuvant (CFA) for the first dose and Incomplete Freund's Adjuvant
(IFA) for the following two doses.

4. Sera were collected before each immunization. Mice were sacrificed 14 days after the third
immunization and the collected sera were pooled and stored at -20°C.

(M) Western blot analysis of Cpn elementary body proteins with mouse sera 

Aliquots of elementary bodies containing approximately 4 µg of proteins, mixed with SDS loading
buffer (1x: 60 mM TRIS-HCl pH 6.8, 5% w/v SDS, 10% v/v glycerin, 0.1% Bromophenol Blue, 100
mM DTT) and boiled for 5 minutes at 95 °C, were loaded on a 12% SDS-PAGE gel. The gel was run
using an SDS-PAGE running buffer containing 250 mM TRIS, 2.5 mM Glycine and 0.1% SDS. The
gel was electroblotted onto a nitrocellulose membrane at 200 mA for 30 minutes. The membrane was
blocked for 30 minutes with PBS, 3% skimmed milk powder and incubated overnight at 4 °C with the
appropriate dilution (1/100) of the sera. After washing twice with PBS + 0.1% Tween (Sigma) the
membrane was incubated for 2 hours with peroxidase-conjugated secondary anti-mouse antibody
(Sigma) diluted 1:3000. The nitrocellulose was washed twice for 10 minutes with PBS + 0.1%
Tween-20 and once with PBS and thereafter developed with the Opti-4CN Substrate Kit (Bio-Rad).

Lanes shown in Western blots are: (P) = pre-immune control serum; (I) = immune serum. 

(N) FACS analysis of Chlamydia pneumoniae elementary bodies with mouse sera 

1. 2 x 10^5 elementary bodies (EB)/well were washed with 200 µl of PBS-0.1% BSA in a 96-well
U-bottom plate and centrifuged for 10 min. at 1200 rpm, at 4°C.

2. The supernatant was discarded and the EB resuspended in 10 µl of PBS-0.1% BSA.

3. 10 µl of mouse sera diluted in PBS-0.1% BSA were added to the EB suspension to a final dilution
of 1:400, and incubated on ice for 30 min.

4. EB were washed by adding 180 µl PBS-0.1% BSA and centrifuged for 10 min. at 1200 rpm, 4°C.

5. The supernatant was discarded and the EB resuspended in 10 µl of PBS-0.1% BSA.

6. 10 µl of a goat anti-mouse IgG, F(ab')2 fragment specific, R-phycoerythrin-conjugated (Jackson
ImmunoResearch Laboratories Inc., cat. No. 115-116-072) was added to the EB suspension to a
final dilution of 1:100, and incubated on ice for 30 min. in the dark.

7. EB were washed by adding 180 µl PBS-0.1% BSA and centrifuged for 10 min. at 1200 rpm, 4°C.

8. The supernatant was discarded and the EB resuspended in 150 µl of PBS-0.1% BSA.

9. The EB suspension was passed through a cytometric chamber of a FACS Calibur (Becton Dickinson,
Mountain View, CA USA) and 10,000 events were acquired.

10. Data were analysed using Cell Quest Software (Becton Dickinson, Mountain View, CA USA) by
drawing a morphological dot plot (using forward and side scatter parameters) on EB signals. A
histogram plot was then created on FL2 intensity of fluorescence log scale recalling the
morphological region of EB.
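Because each reagent is added as 10 µl to 10 µl of EB suspension, the stated final dilutions imply a less dilute working stock. The following sketch works this out, assuming that the final dilution refers to the combined 20 µl; the function name is illustrative and not part of the protocol.

```python
# Sketch only: assumes "final dilution" refers to the mixed 20 µl volume.
from fractions import Fraction


def predilution(final_dilution, reagent_ul, suspension_ul):
    """Dilution of the reagent stock needed before it is added to the suspension."""
    mix_factor = Fraction(reagent_ul + suspension_ul, reagent_ul)
    return Fraction(final_dilution) / mix_factor


print("serum working dilution     1:%s" % predilution(400, 10, 10))
print("conjugate working dilution 1:%s" % predilution(100, 10, 10))
```

Under that assumption the sera would be prepared at 1:200 and the conjugate at 1:50 before addition.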

NB: the results of FACS depend not only on the extent of accessibility of the native antigens but also 
on the quality of the antibodies elicited by the recombinant antigens, which may have structures with 
a variable degree of correct folding as compared with the native protein structures. Therefore, even if 
a FACS assay appears negative this does not necessarily mean that the protein is not abundant or 
accessible on the surface. PorB antigen, for instance, gave negative results in FACS but is a surface- 
exposed neutralising antigen [Kubo & Stephens (2000) Mol Microbiol 38:772-780]. 

(O) Mass Spectrometry analysis of two-dimensional electrophoretic protein maps

Gradient purified EBs from strain FB/96 were solubilized at a final concentration of 5.5 mg/ml with
Immobiline rehydration buffer (7M urea, 2M thiourea, 2% (w/v) CHAPS, 2% (w/v) ASB 14
[Chevallet et al (1998) Electrophor. 19:1901-9], 2% (v/v) C.A. 3-10NL (Amersham Pharmacia
Biotech), 2 mM tributyl phosphine, 65 mM DTT). Samples (250 µg protein) were adsorbed overnight
on Immobiline DryStrips (7 cm, pH 3-10 non linear). Isoelectric focusing was performed in an IPGphor
Isoelectric Focusing Unit (Amersham Pharmacia Biotech). Before PAGE separation, the focused
strips were incubated in 4M urea, 2M thiourea, 30% (v/v) glycerol, 2% (w/v) SDS, 5 mM tributyl
phosphine, 2.5% (w/v) acrylamide, 50 mM Tris-HCl pH 8.8, as described [Herbert et al (1998)
Electrophor. 19:845-51]. SDS-PAGE was performed on linear 9-16% acrylamide gradients. Gels
were stained with colloidal Coomassie (Novex, San Diego) [Doherty et al (1998) Electrophor.
19:355-63]. Stained gels were scanned with a Personal Densitometer SI (Molecular Dynamics) at 8
bits and 50 µm per pixel. Map images were annotated with the software Image Master 2D Elite,
version 3.10 (Amersham Pharmacia Biotech). Protein spots were excised from the gel, using an Ettan
Spot picker (Amersham Pharmacia Biotech), and dried in a vacuum centrifuge. In-gel digestion of
samples for mass spectrometry and extraction of peptides were performed as described by Wilm et
al [Nature (1996) 379:466-9]. Samples were desalted with a ZipTip (Millipore), eluted with a
saturated solution of alpha-cyano-4-hydroxycinnamic acid in 50% acetonitrile, 0.1% TFA and
directly loaded onto a SCOUT 381 multiprobe plate (Bruker). Spectra were acquired on a Bruker
Biflex II MALDI-TOF. Spectra were calibrated using a combination of known standard peptides,
located in spots adjacent to the samples. Resulting values for monoisotopic peaks were used for
database searches using the computer program Mascot (www.matrixscience.com). All searches were
performed using an error of 200-500 ppm as constraint. A representative gel is shown in Figure 190.
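To make the 200-500 ppm search constraint concrete, the sketch below converts a ppm tolerance into an absolute mass window and tests whether an observed monoisotopic peak falls inside it. The peptide masses are hypothetical and the functions are illustrative; they are not part of the Mascot program.

```python
# Sketch only: peptide masses are invented examples, not measured values.
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def mass_window(theoretical_mz, tol_ppm):
    """Lower and upper m/z limits for a given ppm tolerance."""
    delta = theoretical_mz * tol_ppm / 1e6
    return theoretical_mz - delta, theoretical_mz + delta


def within_tolerance(observed_mz, theoretical_mz, tol_ppm):
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


low, high = mass_window(1500.0, 200)
print(f"200 ppm window at m/z 1500: {low:.2f} to {high:.2f}")
print(within_tolerance(1500.25, 1500.0, 200), within_tolerance(1500.5, 1500.0, 200))
```

At m/z 1500 a 200 ppm tolerance corresponds to about 0.3 mass units on either side, and 500 ppm to about 0.75.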

Example 1 

The following C. pneumoniae protein (pid 4376552) was expressed <SEQ ID 1; cp6552>:

1 MKKKLSLLVG LIFVLSSCHK EDAQNKIRIV ASPTPHAELL ESLQEEAKDL
51 GIKLKILPVD DYRIPNRLLL DKQVDANYFQ HQAFLDDECE RYDCKGELVV
101 IAKVHLEPQA IYSKKHSSLE RLKSQKKLTI AIPVDRTNAQ RALHLLEECG
151 LIVCKGPANL NMTAKDVCGK ENRSINILEV SAPLLVGSLP DVDAAVIPGN
201 FAIAANLSPK KDSLCLEDLS VSKYTNLVVI RSEDVGSPKM IKLQKLFQSP
251 SVQHFFDTKY HGNILTMTQD NG*

A predicted signal peptide is highlighted. 

The cp6552 nucleotide sequence <SEQ ID 2> is: 

1 ATGAAAAAAA AATTATCATT ACTTGTAGGT TTAATTTTTG TTTTGAGTTC
51 TTGCCATAAG GAAGATGCTC AGAATAAAAT ACGTATTGTA GCCAGTCCGA
101 CACCTCATGC GGAATTATTG GAGAGCTTAC AGGAAGAGGC TAAAGATCTT
151 GGAATCAAGC TGAAAATACT TCCAGTAGAT GATTATCGTA TTCCTAATCG
201 TTTGCTTTTG GATAAACAAG TAGATGCAAA TTACTTTCAA CATCAAGCTT
251 TTCTTGATGA CGAATGCGAG CGTTATGATT GTAAGGGTGA ATTAGTTGTT
301 ATCGCTAAAG TTCATTTGGA ACCTCAAGCA ATTTATTCTA AGAAACATTC
351 TTCTTTAGAG CGCTTAAAAA GCCAGAAGAA ACTGACTATA GCGATTCCTG
401 TGGATCGTAC GAATGCTCAG CGTGCTCTAC ACTTGTTAGA AGAGTGCGGA
451 CTCATTGTTT GCAAAGGGCC TGCTAATTTA AATATGACAG CTAAAGATGT
501 CTGTGGGAAA GAAAATAGAA GTATCAACAT ATTAGAGGTG TCAGCTCCTC
551 TTCTTGTCGG ATCTCTTCCT GACGTTGATG CTGCTGTCAT TCCTGGAAAT
601 TTTGCTATAG CAGCAAACCT TTCTCCAAAG AAAGATAGTC TTTGTTTAGA
651 GGATCTTTCG GTATCTAAGT ATACAAACCT TGTTGTCATT CGTTCTGAAG
701 ACGTAGGTTC TCCTAAAATG ATAAAATTAC AGAAGCTGTT TCAATCTCCT
751 TCTGTACAAC ATTTTTTTGA TACAAAATAT CATGGGAATA TTTTGACAAT
801 GACTCAAGAC AATGGTTAG
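The relationship between a nucleotide listing and its protein listing can be checked by translating the open reading frame with the standard genetic code. The sketch below is a minimal illustration, assuming the standard code applies and using the first 60 nucleotides of the cp6552 sequence; the helper names are illustrative.

```python
# Sketch only: standard genetic code, applied to the first 60 nt of cp6552.
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string (spaces allowed) from its first base to the first stop."""
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        protein.append(residue)
        if residue == "*":  # stop codon ends the open reading frame
            break
    return "".join(protein)


fragment = ("ATGAAAAAAA AATTATCATT ACTTGTAGGT TTAATTTTTG TTTTGAGTTC "
            "TTGCCATAAG")
print(translate(fragment))  # MKKKLSLLVGLIFVLSSCHK
```

The same function applied to the final nine nucleotides, AATGGTTAG, gives NG followed by the stop, matching the end of the protein listing.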

The PSORT algorithm predicts an inner membrane location (0.127).

The protein was expressed in E. coli and purified as a his-tag product, as shown in Figure 1A, and
also as a GST-fusion. The recombinant protein was used to immunise mice, whose sera were used in
a Western blot (Figure 1B) and for FACS analysis (Figure 1C).

The cp6552 protein was also identified in the 2D-PAGE experiment (Cpn0278).

These experiments show that cp6552 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.

Example 2 

The following C. pneumoniae protein (pid 4376736) was expressed <SEQ ID 3; cp6736>:

1 MKTSIRKFLI STTLAPCFAS TAFTVEVIMP SENFDGSSGK IFPYTTLSDP
51 RGTLCIFSGD LYIANLDNAI SRTSSSCFSN RAGALQILGK GGVFSFLNIR
101 SSADGAAISS VITQNPELCP LSFSGFSQMI FDNCESLTSD TSASNVIPHA
151 SAIYATTPML FTNNDSILFQ YNRSAGFGAA IRGTSITIEN TKKSLLFNGN
201 GSISNGGALT GSAAINLINN SAPVIFSTNA TGIYGGAIYL TGGSMLTSGN
251 LSGVLFVNNS SRSGGAIYAN GNVTFSNNSD LTFQNNTASP QNSLPAPTPP
301 PTPPAVTPLL GYGGAIFCTP PATPPPTGVS LTISGENSVT FLENIASEQG
351 GALYGKKISI DSNKSTIFLG NTAGKGGAIA IPESGELSLS ANQGDILFNK
401 NLSITSGTPT RNSIHFGKDA KFATLGATQG YTLYFYDPIT SDDLSAASAA
451 ATVVVNPKAS ADGAYSGTIV FSGETLTATE AATPANATST LNQKLELEGG
501 TLALRNGATL NVHNFTQDEK SVVIMDAGTT LATTNGANNT DGAITLNKLV
551 INLDSLDGTK AALVNVQSTN GALTISGTLG LVKNSQDCCD NHGMFNKDLQ
601 QVPILELKAT SNTVTTTDFS LGTNGYQQSP YGYQGTWEFT IDTTTHTVTG
651 NWKKTGYLPH PERLAPLIPN SLWANVIDLR AVSQASAADG EDVPGKQLSI
701 TGITNFFHAN HTGDARSYRH MGGGYLINTY TRITPDAALS LGFGQLFTKS
751 KDYLVGHGHS NVYFATVYSN ITKSLFGSSR FFSGGTSRVT YSRSNEKVKT
801 SYTKLPKGRC SWSNNCWLGE LEGNLPITLS SRILNLKQII PFVKAEVAYA
851 THGGIQENTP EGRIFGHGHL LNVAVPVGVR FGKNSHNRPD FYTIIVAYAP
901 DVYRHNPDCD TTLPINGATW TSIGNNLTRS TLLVQASSHT SVNDVLEIFG
951 HCGCDIRRTS RQYTLDIGSK LRF*

A predicted signal peptide is highlighted.

The cp6736 nucleotide sequence <SEQ ID 4> is:

1 ATGAAAACGT CTATTCGTAA GTTCTTAATT TCTACCACAC TGGCGCCATG
51 TTTTGCTTCA ACAGCGTTTA CTGTAGAAGT TATCATGCCT TCCGAGAACT
101 TTGATGGATC GAGTGGGAAG ATTTTTCCTT ACACAACACT TTCTGATCCT
151 AGAGGGACAC TCTGTATTTT TTCAGGGGAT CTCTACATTG CGAATCTTGA
201 TAATGCCATA TCCAGAACCT CTTCCAGTTG CTTTAGCAAT AGGGCGGGAG
251 CACTACAAAT CTTAGGAAAA GGTGGGGTTT TCTCCTTCTT AAATATCCGT
301 TCTTCAGCTG ACGGAGCCGC GATTAGTAGT GTAATCACCC AAAATCCTGA
351 ACTATGTCCC TTGAGTTTTT CAGGATTTAG TCAGATGATC TTCGATAACT
401 GTGAATCTTT GACTTCAGAT ACCTCAGCGA GTAATGTCAT ACCTCACGCA
451 TCGGCGATTT ACGCTACAAC GCCCATGCTC TTTACAAACA ATGACTCCAT
501 ACTATTCCAA TACAACCGTT CTGCAGGATT TGGAGCTGCC ATTCGAGGCA
551 CAAGCATCAC AATAGAAAAT ACGAAAAAGA GCCTTCTCTT TAATGGTAAT
601 GGATCCATCT CTAATGGAGG GGCCCTCACG GGATCTGCAG CGATCAACCT
651 CATCAACAAT AGCGCTCCTG TGATTTTCTC AACGAATGCT ACAGGGATCT
701 ATGGTGGGGC TATTTACCTT ACCGGAGGAT CTATGCTCAC CTCTGGGAAC
751 CTCTCAGGAG TCTTGTTCGT TAATAATAGC TCGCGCTCAG GAGGCGCTAT
801 CTATGCTAAC GGAAATGTCA CATTTTCTAA TAACAGCGAC CTGACTTTCC
851 AAAACAATAC AGCATCTCCA CAAAACTCCT TACCTGCACC TACACCTCCA
901 CCTACACCAC CAGCAGTCAC TCCTTTGTTA GGATATGGAG GCGCCATCTT
951 CTGTACTCCT CCAGCTACCC CCCCACCAAC AGGTGTTAGC CTGACTATAT
1001 CTGGAGAAAA CAGCGTTACA TTCCTAGAAA ACATTGCCTC CGAACAAGGA
1051 GGAGCCCTCT ATGGCAAAAA GATCTCTATA GATTCTAATA AATCTACAAT
1101 ATTTCTTGGA AATACAGCTG GAAAAGGAGG CGCTATTGCT ATTCCCGAAT
1151 CTGGGGAGCT CTCTCTATCC GCAAATCAAG GTGATATCCT CTTTAACAAG
1201 AACCTCAGCA TCACTAGTGG GACACCTACT CGCAATAGTA TTCACTTCGG
1251 AAAAGATGCC AAGTTTGCCA CTCTAGGAGC TACGCAAGGC TATACCCTAT
1301 ACTTCTATGA TCCGATTACA TCTGATGATT TATCTGCTGC ATCCGCAGCC
1351 GCTACTGTGG TCGTCAATCC CAAAGCCAGT GCAGATGGTG CGTATTCAGG
1401 GACTATTGTC TTTTCAGGAG AAACCCTCAC TGCTACCGAA GCAGCAACCC
1451 CTGCAAATGC TACATCTACA TTAAACCAAA AGCTAGAACT TGAAGGCGGT
1501 ACTCTCGCTT TAAGAAACGG TGCTACCTTA AATGTTCATA ACTTCACGCA
1551 AGATGAAAAG TCCGTCGTCA TCATGGATGC AGGGACCACA TTAGCAACTA
1601 CAAATGGAGC TAATAATACT GACGGTGCTA TCACCTTAAA CAAGCTTGTA
1651 ATCAATCTGG ATTCTTTGGA TGGCACTAAA GCGGCTCTCG TTAATGTGCA
1701 GAGTACCAAT GGAGCTCTCA CTATATCCGG AACTTTAGGA CTTGTGAAAA
1751 ACTCTCAAGA TTGCTGTGAC AACCACGGGA TGTTTAATAA AGATTTACAG
1801 CAAGTTCCGA TTTTAGAACT CAAAGCGACT TCAAATACTG TAACCACTAC
1851 GGACTTCAGT CTCGGCACAA ACGGCTATCA GCAATCTCCC TATGGGTATC
1901 AAGGAACTTG GGAGTTTACC ATAGACACGA CAACCCATAC GGTCACAGGA
1951 AATTGGAAAA AAACCGGTTA TCTTCCTCAT CCGGAGCGTC TTGCTCCCCT
2001 CATTCCTAAT AGCCTATGGG CAAACGTCAT AGATTTACGA GCTGTAAGTC
2051 AAGCGTCAGC AGCTGATGGC GAAGATGTCC CTGGGAAGCA ACTGAGCATC
2101 ACAGGAATTA CAAATTTCTT CCATGCGAAT CATACCGGTG ATGCACGCAG
2151 CTACCGCCAT ATGGGTGGAG GCTACCTCAT CAATACCTAC ACACGCATCA
2201 CTCCAGATGC TGCGTTAAGT CTAGGTTTTG GACAGCTGTT TACAAAATCT
2251 AAGGATTACC TCGTAGGTCA CGGTCATTCT AACGTCTATT TCGCTACAGT
2301 ATACTCTAAC ATCACCAAGT CTCTGTTTGG ATCATCGAGA TTCTTCTCAG
2351 GAGGCACTOC TCGAGTTACC TATAGCCGTA GCAATGAGAA AGTAAAGACT
2401 TCATATACAA AATTGCCTAA AGGGCGCTGC TCTTGGAGTA ACAATTGCTG
2451 GTTAGGAGAA CTCGAAGGGA ACCTTCCCAT CACTCTCTCT TCTCGCATCT
2501 TAAACCTCAA GCAGATCATT CCCTTTGTAA AAGCTGAAGT TGCTTACGCG
2551 ACTCATGGGG GCATCCAAGA AAATACCCCC GAGGGGAGGA TTTTTGGACA
2601 CGGTCATCTA CTCAACGTTG CAGTTCCCGT AGGCGTCCGC TTTGGTAAAA
2651 ATTCTCATAA TCGACCAGAT TTTTACACTA TAATCGTAGC CTATGCTCCT
2701 GATGTCTATC GTCACAATCC TGATTGCGAT ACGACATTAC CTATTAATGG
2751 AGCTACGTGG ACCTCTATAG GGAATAATCT AACCAGAAGT ACTTTGCTAG
2801 TACAAGCATC CAGCCATACT TCAGTAAATG ATGTTCTAGA GATCTTCGGG
2851 CACTGTGGAT GTGATATTCG CAGAACCTCC CGTCAATATA CTCTAGATAT
2901 AGGAAGCAAA TTACGATTTT AA

The PSORT algorithm predicts an outer membrane location (0.917).

The protein was expressed in E. coli and purified as a his-tag product, as shown in Figure 2A, and
also as a GST-fusion. Both proteins were used to immunise mice, whose sera were used in a Western
blot (Figure 2B) and for FACS analysis (Figure 2C).

The cp6736 protein was also identified in the 2D-PAGE experiment (Cpn0453) and showed good
cross-reactivity with human sera, including sera from patients with pneumonitis.

These experiments show that cp6736 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.

Example 3 

The following C. pneumoniae protein (pid 4376751) was expressed <SEQ ID 5; cp6751>:

1 MRFFCFGMLL PFTFVLANEG LQLPLETYIT LSPEYQAAPQ VGFTHNQNQD
51 LAIVGNHNDF ILDYKYYRSN GGALTCKNLL ISENIGNVFF EKNVCPNSGG
101 AIYAAQNCTI SKNQNYAFTT NLVSDNPTAT AGSLLGGALF AINCSITNNL
151 GQGTFVDNLA LNKGGALYTE TNLSIKDNKG PIIIKQNRAL NSDSLGGGIY
201 SGNSLNISGN SGAIQITSNS SGSGGGIFST QTLTISSNKK LIEISENSAF
251 ANNYGSNFNP GGGGLTTTFC TILNNREGVL FNNNQSQSNG GAIHAKSIII
301 KENGPVYFLN NTATRGGALL NLSAGSGNGS FILSADNGDI IFNNNTASKH
351 ALNPPYRNAI HSTPNMNLQI GARPGYRVLF YDPIEHELPS SFPILFNFET
401 GHTGTVLFSG EHVHQNFTDE MNFFSYLRNT SELRQGVLAV EDGAGLACYK
451 FFQRGGTLLL GQGAVITTAG TIPTPSSTPT TVGSTITLNH IAIDLPSILS
501 FQAQAPKIWI YPTKTGSTYT EDSNPTITIS GTLTLRNSNN EDPYDSLDLS
551 HSLEKVPLLY IVDVAAQKIN SSQLDLSTLN SGEHYGYQGI WSTYWVETTT
601 ITNPTSLLGA NTKHKLLYAN WSPLGYRPHP ERRGEFITNA LWQSAYTALA
651 GLHSLSSWDE EKGHAASLQG IGLLVHQKDK NGFKGFRSHM TGYSATTEAT
701 SSQSPNFSLG FAQFFSKAKE HESQNSTSSH HYFSGMCIEN TLFKEWIRLS
751 VSLAYMFTSE HTHTMYQGLL EGNSQGSFHN HTLAGALSCV FLPQPHGESL
801 QIYPFITALA IRGNLAAFQE SGDHAREFSL HRPLTDVSLP VGIRASWKNH
851 HRVPLVWLTE ISYRSTLYRQ DPELHSKLLI SQGTWTTQAT PVTYNALGIK
901 VKNTMQVFPK VTLSLDYSAD ISSSTLSHYL NVASRMRF*



A predicted signal peptide is highlighted.

The cp6751 nucleotide sequence <SEQ ID 6> is:

1 ATGCGCTTTT TTTGCTTCGG AATGTTGCTT CCTTTTACTT TTGTATTGGC
51 TAATGAAGGT CTCCAACTTC CTTTGGAGAC CTATATTACA TTAAGTCCTG
101 AATATCAAGC AGCCCCTCAA GTAGGGTTTA CTCATAACCA AAATCAAGAT
151 CTCGCAATTG TCGGGAATCA CAATGATTTC ATCTTGGACT ATAAGTACTA
201 TCGGTCGAAT GGAGGTGCTC TTACCTGTAA GAATCTTCTG ATCTCTCAAA
251 ATATAGGGAA TGTCTTCTTT GAGAAGAATG TCTGTCCCAA TTCTGGCGGG
301 GCAATTTATG CTGCTCAAAA TTGCACGATC TCCAAGAATC AGAACTATGC
351 ATTTACTACA AACTTGGTCT CTGACAATCC TACAGCCACT GCGGGATCAC
401 TATTGGGTGG AGCTCTCTTT GCCATAAATT GCTCTATTAC TAATAACCTA
451 GGACAGGGAA CTTTCGTTGA CAATCTCGCT TTAAATAAGG GGGGTGCCCT
501 CTATACTGAG ACGAACTTAT CTATTAAAGA CAATAAAGGC CCGATCATAA
551 TCAAGCAGAA TCGGGCACTA AATTCGGACA GTTTAGGAGG AGGGATTTAT
601 AGTGGGAACT CTCTAAATAT AGAGGGAAAT TCTGGAGCTA TACAGATCAC
651 AAGCAACTCT TCAGGATCTG GGGGAGGCAT ATTTTCTACC CAAACACTCA
701 CGATCTCCTC GAATAAAAAA CTCATAGAAA TCAGTGAAAA TTCCGCGTTC
751 GCAAATAACT ATGGATCGAA CTTCAATCCA GGAGGAGGAG GTCTTACTAC
801 CACCTTTTGC ACGATATTGA ACAACCGAGA AGGGGTACTC TTTAACAATA
851 ACCAAAGCCA GAGCAACGGT GGAGCCATTC ATGCGAAATC TATCATTATC
901 AAAGAAAATG GTCCTGTATA CTTTTTAAAT AACACTGCAA CTCGGGGAGG
951 GGCTCTCCTC AACTTATCAG CAGGTTCTGG AAACGGAAGC TTCATCTTAT
1001 CTGCAGATAA TGGAGATAM ATCTTTAACA ATAATACGGC CTCCAAGCAT
1051 GCCCTCAATC CTCCATACAG AAACGCCATT CACTCGACTC CTAATATGAA
1101 TCTGCAAATA GGAGCCCGTC CCGGCTATCG AGTGCTGTTC TATGATCCCA
1151 TAGAACATGA GCTCCCTTCC TCCTTCCCCA TACTCTTTAA TTTCGAAACC
1201 GGTCATACAG GTACAGTTTT ATTTTCAGGG GAACATGTAC ACCAGAACTT
1251 TACCGATGAA ATGAATTTCT TTTCCTATTT AAGGAACACT TCGGAACTAC
1301 GTCAAGGAGT CCTTGCTGTT GAAGATGGTG CGGGGCTGGC CTGCTATAAG
1351 TTCTTCCAAC GAGGAGGCAC TCTACTTCTA GGTCAAGGTG CGGTGATCAC
1401 GACAGCAGGA ACGATTCCCA CACCATCCTC AACACCAACG ACAGTAGGAA
1451 GTACTATAAC TTTAAATCAC ATTGCCATTG ACCTTCCTTC TATTCTTTCT
1501 TTTCAAGCTC AGGCTCCAAA AATTTGGATT TACCCCACAA AAACAGGATC
1551 TACCTATACT GAAGATTCCA ACCCGACAAT CACAATCTCA GGAACTCTCA
1601 CCTTACGCAA CAGCAACAAC GAAGATCCCT ACGATAGTCT GGATCTCTCG
1651 CACTCTCTTG AGAAAGTTCC CCTTCTTTAT ATTGTCGATG TCGCTGCACA
1701 AAAAATTAAC TCTTCGCAAC TGGATCTATC CACATTAAAT TCTGGCGAAC
1751 ACTATGGGTA TCAAGGCATC TGGTCGACCT ATTGGGTAGA AACTACAACA
1801 ATCACGAACC CTACATCTCT ACTAGGCGCG AATACAAAAC ACAAGCTCCT
1851 CTATGCAAAC TGGTCTCCTC TAGGCTACCG TCCTCATCCC GAACGfFCGAG
1901 GAGAATTCAT TACGAATGCC TTGTGGCAAT CGGCATATAC GGCTCTTGCA
1951 GGACTCCACT CCCTCTCCTC CTGGGATGAA GAGAAGGGTC ATGCAGCTTC
2001 CCTACAAGGC ATTGGTCTTC TGGTTCATCA AAAAGACAAA AACGGTTTTA
2051 AGGGATTTCG TAGTCATATG ACAGGTTATA GTGCTACCAC CGAAGCAACC
2101 TCTTCTCAAA GTCCGAATTT CTCTTTAGGA TTTGCTCAGT TCTTCTCCAA
2151 AGCTAAAGAA CATGAATCTC AAAATAGCAC GTCCTCTCAC CACTATTTCT
2201 CTGGAATGTG CATAGAAAAT ACTCTCTTCA AAGAGTGGAT ACGTCTATCT
2251 GTGTCTCTTG CTTATATGTT TACCTCGGAA CATACCCATA CAATGTATCA
2301 GGGTCTCCTG GAAGGGAACT CTCAGGGATC TTTCCACAAC CATACCTTAG
2351 CAGGGGCTCT CTCCTGTGTT TTCTTACCTC AACCTCACGG CGAGTCCCTG
2401 CAGATCTATC CCTTTATTAC TGCCTTAGCC ATCCGAGGAA ATCTTGCTGC
2451 GTTTCAAGAA TCTGGAGACC ATGCTCGGGA ATTTTCCCTA CACCGCCCCC
2501 TAACGGACGT CTCCCTCCCT GTAGGAATCC GCGCTTCTTG GAAGAACCAC
2551 CACCGAGTTC CCCTAGTCTC GCTCACAGAA ATTTCCTATC GCTCTACTCT
2601 CTATAGGCAA GATCCTGAAC TCCACTCGAA ATTACTGATT AGCCAAGGTA
2651 CGTGGACGAC GCAGGCCACT CCTGTGACCT ACAATGCTTT AGGGATCAAA
2701 GTGAAAAATA CCATGCAGGT GTTTCCTAAA GTCACTCTCT CCTTAGATTA
2751 CTCTGCGGAT ATTTCTTCCT CCACGCTGAG TCACTACTTA AACGTGGCGA
2801 GTAGAATGAG ATTTTAA

The PSORT algorithm predicts an outer membrane location (0.923). 

The protein was expressed in E. coli and purified as a GST-fusion product, as shown in Figure 3A,
and also in his-tagged form. The GST-fusion recombinant protein was used to immunise mice, whose
sera were used in a Western blot (Figure 3B) and for FACS analysis (Figure 3C).

This protein also showed good cross-reactivity with human sera, including sera from patients with
pneumonitis.

These experiments show that cp6751 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.

Example 4 

The following C. pneumoniae protein (pid 4376752) was expressed <SEQ ID 7; cp6752>:

1 MFGMTPAVYS LQTDSLEKFA LERDEEFRTS FPLLDSLSTL TGFSPITTFV
51 GNRHNSSQDI VLSNYKSIDN ILLLWTSAGG AVSCNNFLLS NVEDHAFFSK
101 NLAIGTGGAI ACQGACTITK NRGPLIFFSN RGLNNASTGG ETRGGAIACN
151 GDFTISQNQG TFYFVNNSVN NWGGALSTNG HCRIQSNRAF LLFFNNTAPS
201 GGGALRSENT TISDNTRPIY FKNNCGNNGG AIQTSVTVAI KNNSGSVIFN
251 NNTALSGSIN SGNGSGGAIY TTNLSIDDNP GTILFNNNYC IRDGGAICTQ
301 FLTIKNSGHV YFTNNQGNWG GALMLLQDST CXiIrFAEQGNI AFQNHEVFLT
351 TFGRYNAIHC TPNSNLQLGA NKGYTTAFFD PIEHQHPTTN PLIFNPNAMH
401 QGTILFSSAY IPEASDYEHN FISSSKNTSE LRNGVLSIED RAGWQFYKFT
451 QKGGILKLGH AASIATTANS ETPSTSVGSQ VIINNLAINL PSILAKGKAP
501 TLWIRPLQSS APFTEDNNPT ITLSGPLTLL NEENRDPYDS IDLSEPLQNI
551 HLLSLSDVTA RHINTDNFHP ESLNATEHYG YQGIWSPYWV ETITTTNNAS
601 IETANTLYRA LYANWTPLGY KVNPEYQGDL ATTPLWQSFH TMFSLLRSYN
651 RTGDSDIERP FLEIQGIADG LFVHQNSIPG APGFRIQSTG YSLQASSETS
701 LHQKISLGFA QFFTRTKEIG SSNNVSAHNT VSSLYVELPW FQEAFATSTV
751 LAYGYGDHHL HSLHPSHQEQ AEGTCYSHTL AAAIGCSFPW QQKSYLHkSP
801 FVQAIAIRSH QTAFEEIGDN PRKFVSQKPF YNLTLPLGIQ GKWQSKFHVP
851 TEWTLELSYQ PVLYQQNPQI GVTLLASGGS WDILGHNYVR NALGYKVHNQ
901 TALFRSLDLF LDYQGSVSSS TSTHHLQAGS TLKF*



The cp6752 nucleotide sequence <SEQ ID 8> is: 

1 ATGTTCGGGA TGACTCCTGC AGTGTATAGT TTACAAACGG ACTCCCTTGA
51 AAAGTTTGCT TTAGAGAGGG ATGAAGAGTT TCGTACGAGC TTTCCTCTCT
101 TAGACTCTCT CTCCACTCTT ACAGGATTTT CTCCAATAAC TACGTTTGTT
151 GGAAATAGAC ATAATTCCTC TCAAGACATT GTACTTTCTA ACTACAAGTC
201 TATTGATAAC ATCCTTCTTC TTTGGACATC GGCTGGGGGA GCTGTGTCCT
251 GTAATAATTT CTTATTATCA AATGTTGAAG ACCATGCCTT CTTCAGTAAA
301 AATCTCGCGA TTGGGACTGG AGGCGCGATT GCTTGCCAGG GAGCCTGCAC
351 AATCACGAAG AATAGAGGAC CCCTTATTTT TTTCAGCAAT CGAGGTCTTA
401 ACAATGCGAG TACAGGAGGA GAAACTCGTG GGGGTGCGAT TGCCTGTAAT
451 GGAGACTTCA CGATTTCTCA AAATCAAGGG ACTTTCTACT TTGTCAACAA
501 TTCCGTCAAC AACTGGGGAG GAGCCCTCTC CACCAATGGA CACTGCCGCA
551 TCCAAAGCAA CAGGGCACCT CTACTCTTTT TTAACAATAC AGCCCCTAGT
601 GGAGGGGGTG CGCTTCGTAG TGAAAATACA ACGATCTCTG ATAACACGCG
651 TCCTATTTAT TTTAAGAACA ACTGTGGGAA CAATGGCGGG GCCATTCAAA
701 CAAGCGTTAC TGTTGCGATA AAAAATAACT CCGGGTCGGT GATTTTCAAT
751 AACAACACAG CGTTATCTGG TTCGATAAAT TCAGGAAATG GTTCAGGAGG
801 GGCGATTTAT ACAACAAACC TATCCATAGA CGATAACCCT GGAACTATTC
851 TTTTCAATAA TAACTACTGC ATTCGCGATG GCGGAGCTAT CTGTACACAA
901 TTTTTGACAA TCAAAAATAG TGGCCACGTA TATTTCACCA ACAATCAAGG
951 AAACTGGGGA GGTGCTCTTA TGCTCCTACA GGACAGCACC TGCCTACTCT
1001 TCGCGGAACA AGGAAATATC GCATTTGAAA ATAATGAGGT TTTCCTCACC
1051 ACATTTGGTA GATACAACGC CATACATTGT ACACCAAATA GCAACTTACA
1101 ACTTGGAGCT AATAAGGGGT ATACGACTGC TTTTTTTGAT CCTATAGAAC
1151 ACCAACATCC AACTACAAAT CCTCTAATCT TTAATCCCAA TGCGAACCAT
1201 CAGGGAACGA TCTTATTTTC TTCAGCCTAT ATCCCAGAAG CTTCTGACTA
1251 CGAAAATAAT TTCATTAGCA GCTCGAAAAA TACCTCTGAA CTTCGCAATG
1301 GTGTCCTCTC TATCGAGGAT CGTGCGGGAT GGCAATTCTA TAAGTTCACT
1351 CAAAAAGGAG GTATCCTTAA ATTAGGGCAT GCGGCGAGTA TTGCAACAAC
1401 TGCCAACTCT GAGACTCCAT CAACTAGTGT AGGCTCCCAG GTCATCATTA
1451 ATAACCTTGC GATTAACCTC CCCTCGATCT TAGCAAAAGG AAAAGCTCCT
1501 ACCTTGTGGA TCCGTCCTCT ACAATCTAGT GCTCCTTTCA CAGAGGACAA
1551 TAACCCTACA ATTACTTTAT CAGGTCCTCT GACACTCTTA AATGAGGAAA
1601 ACCGCGATCC CTACGACAGT ATAGATCTCT CTGAGCCTTT ACAAAACATT
1651 CATCTTCTTT CTTTATCGGA TGTAACAGCA CGTCATATCA ATACCGATAA
1701 CTTTCATCCT GAAAGCTTAA ATGCGACTGA GCATTACGGT TATCAAGGCA
1751 TCTGGTCTCC TTATTGGGTA GAGACGATAA CAACAACAAA TAACGCTTCT
1801 ATAGAGACGG CAAACACCCT CTACAGAGCT CTGTATGCCA ATTGGACTCC
1851 CTTAGGATAT AAGGTCAATC CTGAATACCA AGGAGATCTT GCTACGACTC
1901 CCCTATGGCA ATCCTTTCAT ACTATGTTCT CTCTATTAAG AAGTTATAAT
1951 CGAACTGGTG ATTCTGATAT CGAGAGGCCT TTCTTAGAAA TTCAAGGGAT
2001 TGCCGACGGC CTCTTTGTTC ATCAAAATAG CATCCCCGGG GCTCCAGGAT
2051 TCCGTATCCA ATCTACAGGG TATTCCTTAC AAGCATCCTC CGAAACTTCT
2101 TTACATCAGA AAATCTCCTT AGGTTTTGCA CAGTTCTTCA CCCGCACTAA
2151 AGAAATCGGA TCAAGCAACA ACGTCTCGGC TCACAATACA GTCTCTTCAC
2201 TTTATGTTGA GCTTCCGTGG TTCCAAGAGG CCTTTGCAAC ATCCACAGTG
2251 TTAGCGTATG GCTATGGGGA CCATCACCTC CACAGCCTAC ATCCCTCACA
2301 TCAAGAACAG GCAGAAGGGA CGTGTTATAG CCATACATTA GCAGCAGCTA
2351 TCGGCTGTTC TTTCCCTTGG CAACAGAAAT CCTATCTTCA CCTCAGCCCG
2401 TTCGTTCAGG CAATTGCAAT ACGTTCTCAC CAAACAGCGT TCGAAGAGAT
2451 TGGTGACAAT CCCCGAAAGT TTGTCTCTCA AAAGCCTTTC TATAATCTGA
2501 CCTTACCTCT AGGAATCCAA GGAAAATGGC AGTCAAAATT CCACGTACCT
2551 ACAGAATGGA CTCTAGAACT TTCTTACCAA CCGGTACTCT ATCAACAAAA
2601 TCCCCAAATC GGTGTCACGC TACTTGCGAG CGGAGGTTCC TGGGATATCC
2651 TAGGCCATAA CTATGTTCGC AATGCTTTAG GGTACAAAGT CCACAATCAA
2701 ACTGCGCTCT TCCGTTCTCT CGA^CTATTC TTGGATTACC AAGGATCGGT
2751 CTCCTCCTCG ACATCTACGC ACCATCTCCA AGCAGGAAGT ACCTTAAAAT
2801 TCTAA

The PSORT algorithm predicts a cytoplasmic location (0.138). 
The protein was expressed in E.coli and purified as a his-tag product, as shown in Figure 4A, and
also as a GST-fusion. The recombinant proteins were used to immunise mice, whose sera were used
in a Western blot (Figure 4B), and the his-tagged protein was used for FACS analysis (Figure 4C).

The cp6752 protein was also identified in the 2D-PAGE experiment (Cpn0467). 

These experiments show that cp6752 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 5 

The following C.pneumoniae protein (pid 4376850) was expressed <SEQ ID 9; cp6850>:

1 MKKAVLIAAM FCGVVSLSSC CRIVDCCFED PCAPSSCNPC EVIRKKERSC
51 GGNACGSYVP SCSNPCGSTE CNSQSPQVKG CTSPDGRCKQ *

A predicted signal peptide is highlighted. 

The cp6850 nucleotide sequence <SEQ ID 10> is:

1 ATGAAGAAAG CTGTTTTAAT TGCTGCAATG TTTTGTGGAG TAGTTAGCTT 

51 AAGTAGCTGC TGCCGCATTG TAGATTGTTG TTTTGAGGAT CCTTGCGCAC 

101 CCTCTTCTTG CAATCCTTGT GAAGTAATAA GAAAAAAAGA AAGATCTTGC

151 GGCGGTAATG CTTGTGGGTC CTACGTTCCT TCTTGTTCTA ATCCATGTGG

201 TTCAACAGAG TGTAACTCTC AAAGCCCACA AGTTAAAGGT TGTACATCAC
251 CTGATGGCAG ATGCAAACAG TAA 
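
One way to cross-check a protein listing against its nucleotide listing is to translate the nucleotide sequence and compare the two residue by residue. The following is a minimal sketch, assuming the standard genetic code; the function names are illustrative only, and the two sequence strings are transcriptions of SEQ ID 10 and SEQ ID 9 that may themselves carry transcription errors.

```python
# Standard genetic code (NCBI translation table 1), indexed in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def clean(listing, alphabet):
    """Keep only characters from `alphabet`; drops position labels and spaces."""
    return "".join(ch for ch in listing.upper() if ch in alphabet)


def translate(dna):
    """Translate from the first base, codon by codon, stopping after a stop codon."""
    residues = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        residues.append(residue)
        if residue == "*":
            break
    return "".join(residues)


def mismatches(translated, listed):
    """1-based positions where the translation and the listed protein disagree."""
    return [(i + 1, t, r) for i, (t, r) in enumerate(zip(translated, listed)) if t != r]


# Transcriptions of SEQ ID 10 and SEQ ID 9 (assumed readings of the listings).
SEQ_ID_10 = clean("""
      1 ATGAAGAAAG CTGTTTTAAT TGCTGCAATG TTTTGTGGAG TAGTTAGCTT
     51 AAGTAGCTGC TGCCGCATTG TAGATTGTTG TTTTGAGGAT CCTTGCGCAC
    101 CCTCTTCTTG CAATCCTTGT GAAGTAATAA GAAAAAAAGA AAGATCTTGC
    151 GGCGGTAATG CTTGTGGGTC CTACGTTCCT TCTTGTTCTA ATCCATGTGG
    201 TTCAACAGAG TGTAACTCTC AAAGCCCACA AGTTAAAGGT TGTACATCAC
    251 CTGATGGCAG ATGCAAACAG TAA
""", "ACGT")
SEQ_ID_9 = clean("""
      1 MKKAVLIAAM FCGVVSLSSC CRIVDCCFED PCAPSSCNPC EVIRKKERSC
     51 GGNACGSYVP SCSNPCGSTE CNSQSPQVKG CTSPDGRCKQ *
""", "ACDEFGHIKLMNPQRSTVWY*")

translated = translate(SEQ_ID_10)
print(len(SEQ_ID_10), len(translated))
print(mismatches(translated, SEQ_ID_9))
```

Any position reported by `mismatches` marks a residue where the two listings disagree and where one of them should be re-read against the original document.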

The PSORT algorithm predicts an inner membrane location (0.329). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 5A.
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 5B) and for FACS analysis (Figure 5B). A his-tagged protein was also expressed. 

These experiments show that cp6850 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 6

The following C.pneumoniae protein (pid 4376900) was expressed <SEQ ID 11; cp6900>:

1 MKIKFSWKVN FLICLLAVGL IFFGCSKVKR EVLVGRDATW FPKQFGIYTS
51 DTNAFLNDLV SEINYKENLN INIVNQDWVH LFENLDDKKT QGAFTSVLPT
101 LEMLEHYQFS DPILLTGPVL VVAQDSPYQS IEDLKGRLIG VYKFDSSVLV
151 AQNIPDAVIS LYQHVPIALE ALTSNCYDAL LAPVIEVTAL IETAYKGRLK
201 IISKPLNADG LRLAILKGTN GDLLEGFHAG LVKTRRSGKY DAIKQRYRLP

The cp6900 nucleotide sequence <SEQ ID 12> is: 

1 GTGAAGATAA AATTTTCTTG GAAGGTAAAT TTTTTAATAT GTTTACTGGC 

51 TGTGGGACTG ATCTTTTTCG GGTGCTCTCG AGTAAAAAGA GAAGTTCTCG 

101 TAGGTCGTGA TGCCACCTGG TTTCCAAAAC AATTCGGCAT TTATACATCC

151 GATACCAACG CATTTTTAAA CGATCTTGTT TCTGAGATTA ACTATAAAGA 

201 GAATCTAAAT ATTAATATTG TAAATCAAGA TTGGGTGCAT CTCTTTGAGA 

251 ATTTAGATGA TAAAAAGACC CAAGGAGCAT TTACATCTGT ATTGCCTACT 

301 CTTGAGATGC TCGAACACTA TCAATTTTCT GATCCCATTT TACTCACAGG
351 TCCTGTCCTT GTCGTCGCTC AAGACTCTCC TTACCAATCT ATAGAGGATC

401 TTAAAGGTCG TCTTATTGGA GTGTATAAGT TTGACTCTTC AGTTCTTGTA 

451 GCTCAAAATA TCCCTGACGC TGTGATTAGC CTCTACCAAC ATGTTCCAAT

501 AGCATTGGAA GCCTTAACAT CGAATTGTTA CGACGCTCTT CTAGCTCCTG 

551 TAATTGAAGT GACCGCGCTA ATAGAAACAG CATATAAAGG AAGACTGAAA 

601 ATTATTTCAA AACCCTTAAA CGCAGATGGT TTGCGGCTTG CAATACTGAA
651 AGGGACAAAC GGAGATTTGC TTGAAGGGTT TAACGCAGGA CTTGTGAAAA
701 CACGACGCTC AGGAAAATAC GATGCTATAA AACAGCGGTA TCGTCTTCCC 
751 TAA 

The PSORT algorithm predicts an inner membrane location (0.452). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 6A.
The recombinant protein was used to immunise mice, whose sera were used for FACS analysis 
(Figure 6B). A his-tagged protein was also expressed. 

The cp6900 protein was also identified in the 2D-PAGE experiment (Cpn0604). 

These experiments show that cp6900 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone.

Example 7 

The following C.pneumoniae protein (pid 4377033) was expressed <SEQ ID 13; cp7033>:

1 MVNPIGPGPI DETERTPPAD LSAQGLEASA ANKSAEAQRI AGAEAKPKES
51 KTDSVERWSI LRSAVNALMS LADKLGIASS NSSSSTSRSA DVDSTTATAP
101 TPPPPTFDDY KTQAQTAYDT IFTSTSLADI QAALVSLQDA VTNIKDTAAT
151 DEETAIAAEW ETKNADAVKV GAQITELAKY ASDNQAILDS LGKLTSFDLL
201 QAALLQSVAN NNKAAELLKE MQDNPVVPGK TPAIAQSDVD QTDATATQIE
251 KDGNAIRDAY FAGQNASGAV ENAKSNNSIS NIDSAKAAIA TAKTQIAEAQ
301 KKFPDSPILQ EAEQMVTQAE KDLKNIKPAD GSDVPNPGTT VGGSKQQGSS
351 IGSIRVSMLL DDAENETASI LMSGFRQMIH MFNTENPDSQ AAQQELAAQA
401 RAAKAAGDDS AAAALADAQK ALEAALGKAG QQQGILNALG QIASAAVVSA
451 GVPPAAASSI GSSVKQLYKT SKSTGSDYKT QISAGYDAYK SINDAYGRAR
501 NDATRDVINN VSTPALTRSV PRARTEARGP EKTDQALARV ISGNSRTLGD
551 VYSQVSALQS VMQIIQSNPQ ANNEEIRQKL TSAVTKPPQF GYPYVQLSND
601 STQKFIAKLE SLFAEGSRTA AEIKALSFET NSLFIQQVLV NIGSLYSGYL
651 Q*

The cp7033 nucleotide sequence <SEQ ID 14> is: 

1 ATGGTTAATC CTATTGGTCC AGGTCCTATA GACGAAACAG AACGCACACC 

51 TCCCGCAGAT CTTTCTGCTC AAGGATTGGA GGCGAGTGCA GCAAATAAGA 

101 GTGCGGAAGC TCAAAGAATA GCAGGTGCGG AAGCTAAGCC TAAAGAATCT

151 AAGACCGATT CTGTAGAGCG ATGGAGCATC TTGCGTTCTG CAGTGAATGC 

201 TCTCATGAGT CTGGCAGATA AGCTGGGTAT TGCTTCTAGT AACAGCTCGT 

251 CTTCTACTAG CAGATCTGCA GACGTGGACT CAACGACAGC GACCGCACCT 

301 ACGCCTCCTC CACCCACGTT TGATGATTAT AAGACTCAAG CGCAAACAGC

351 TTACGATACT ATCTTTACCT CAACATCACT AGCTGACATA CAGGCTGCTT

401 TGGTGAGCCT CCAGGATGCT GTCACTAATA TAAAGGATAC AGCGGCTACT 

451 GATGAGGAAA CCGCAATCGC TGCGGAGTGG GAAACTAAGA ATGCCGATGC 

501 AGTTAAAGTT GGCGCGCAAA TTACAGAATT AGCGAAATAT GCTTCGGATA 

551 ACCAAGCGAT TCTTGACTCT TTAGGTAAAC TGACTTCCTT CGACCTCTTA 

601 CAGGCTGCTC TTCTCCAATC TGTAGCAAAC AATAACAAAG CAGCTGAGCT

651 TCTTAAAGAG ATGCAAGATA ACCCAGTAGT CCCAGGGAAA ACGCCTGCAA

701 TTGCTCAATC TTTAGTTGAT CAGACAGATG CTACAGCGAC ACAGATAGAG 

751 AAAGATGGAA ATGCGATTAG GGATGCATAT TTTGCAGGAC AGAACGCTAG 

801 TGGAGCTGTA GAAAATGCTA AATCTAATAA CAGTATAAGC AACATAGATT 

851 CAGCTAAAGC AGCAATCGCT ACTGCTAAGA CACAAATAGC TGAAGCTCAG

901 AAAAAGTTCC CCGACTCTCC AATTCTTCAA GAAGCGGAAC AAATGGTAAT 

951 ACAGGCTGAG AAAGATCTTA AAAATATCAA ACCTGCAGAT GGTTCTGATG 

1001 TTCCAAATCC AGGAACTACA GTTGGAGGCT CCAAGCAACA AGGAAGTAGT 

1051 ATTGGTAGTA TTCGTGTTTC CATGCTGTTA GATGATGCTG AAAATGAGAC 

1101 CGCTTCCATT TTGATGTCTG GGTTTCGTCA GATGATTCAC ATGTTCAATA

1151 CGGAAAATCC TGATTCTCAA GCTGCCCAAC AGGAGCTCGC AGCACAAGCT 

1201 AGAGCAGCGA AAGCCGCTGG AGATGACAGT GCTGCTGCAG CGCTGGCAGA 

1251 TGCTCAGAAA GCTTTAGAAG CGGCTCTAGG TAAAGCTGGG CAACAACAGG 

1301 GCATACTCAA TGCTTTAGGA CAGATCGCTT CTGCTGCTGT TGTGAGCGCA 

1351 GGAGTTCCTC CCGCTGCAGC AAGTTCTATA GGGTCATCTG TAAAACAGCT

1401 TTACAAGACC TCAAAATCTA CAGGTTCTGA TTATAAAACA CAGATATCAG
1451 CAGGTTATGA TGCTTACAAA TCCATCAATG ATCCCTATGG TAGGGCACGA
1501 AATGATGCGA CTCGTGATGT GATAAACAAT GTAAGTACCC CCGCTCTCAC 
1551 ACGATCCGTT CCTAGAGCAC GAACAGAAGC TCGAGGACCA GAAAAAACAG 
1601 ATCAAGCCCT CGCTAGGGTG ATTTCTGGCA ATAGCAGAAC TCTTGGAGAT 
1651 GTCTATAGTC AAGTTTCGGC ACTACAATCT GTAATGCAGA TCATCCAGTC

1701 GAATCCTCAA GCGAATAATG AGGAGATCAG ACAAAAGCTT ACATCGGCAG 
1751 TGACAAAGCC TCCACAGTTT GGCTATCCTT ATGTGCAACT TTCTAATGAC 
1801 TCTACACAGA AGTTCATAGC TAAATTAGAA AGTTTGTTTG CTGAAGGATC
1851 TAGGACAGCA GCTGAAATAA AAGCACTTTC CTTTGAAACG AACTCCTTGT
1901 TTATTCAGCA GGTGCTGGTC AATATCGGCT CTCTATATTC TGGTTATCTC

1951 CAATAA 

The PSORT algorithm predicts a cytoplasmic location (0.272). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 7A. A
his-tagged protein was also expressed. The recombinant proteins were used to immunise mice, whose
sera were used for FACS (Figure 7B) and Western blot (Figure 7C) analyses.

The cp7033 protein was also identified in the 2D-PAGE experiment (Cpn0728) and showed good 
cross-reactivity with human sera, including sera from patients with pneumonitis. 

These experiments show that cp7033 is a surface-exposed and immunoaccessible protein, and that it is
a useful immunogen. These properties are not evident from the sequence alone. 

Example 8

The following C.pneumoniae protein (pid 6172321) was expressed <SEQ ID 15; cp0017>:

1 MGIKGTGIIV WVDDATAKTK NATLTWTKTG YKPNPERQGP LVPNSLWGSF
51 VDVRSIQSLM DRSTSSLSSS TNLWVSGIAD FLHEDQKGNQ RSYRHSSAGY
101 ALGGGFFTAS ENFFNFAFCQ LFGYDKDHLV AKNHTHVYAG AMSYRHLGES
151 KTLAKILSGN SDSLPFVFNA RFAYGHTDNN MTTKYTGYSP VKGSWGNDAF
201 GIECGGAIPV VASGRRSWVD THTPFLNLEM IYAHQNDFKE NGTEGRSFQS
251 EDLFNLAVPV GIKFEKFSDK STYDLSIAYV PDVIRNDPGC TTTLMVSGDS
301 WSTCGTSLSR QALLVRAGNH HAFASNFEVF SQFEVELRGS SRSYAIDLGG
351 RFGF*

The cp0017 nucleotide sequence <SEQ ID 16> is:

1 ATGGGTATCA AGGGAACTGG AATAATTGTO TGGGTCGACG ATGCAACTGC 

51 AAAAACAAAA AATGCTACCT TAACTTGGAC TAAAACAGGA TACAAGCCGA 

101 ATCCAGAACG TCAGGGACCT TTGGTTCCTA ATAGCCTGTG GGGTTCTTTT 

151 GTCGATGTCC GCTCCATTCA GAGCCTCATG GACCGGAGCA CAAGTTCGTT

201 ATCTTCGTCA ACAAATTTGT GGGTATCAGG AATCGCGGAC TTTTTGCATG

251 AAGATCAGAA AGGAAACCAA CGTAGTTATC GTCATTCTAG CGCGGGTTAT 

301 GCATTAGGAG GAGGATTCTT CACGGCTTCT GAAAATTTCT TTAATTTTGC 

351 TTTTTGTCAG CTTTTTGGCT ACGACAAGGA CCATCTTGTG GCTAAGAACC 

401 ATACCCATGT ATATGCAGGG GCAATGAGTT ACCGACACCT CGGAGAGTCT

451 AAGACCCTCG CTAAGATTTT GTCAGGAAAT TCTGACTCCC TACCTTTTGT

501 CTTCAATGCT CGGTTTGCTT ATGGCCATAC CGACAATAAC ATGACCACAA 

551 AGTACACTGG CTATTCTCCT GTTAAGGGAA GCTGGGGAAA TGATGCCTTC 

601 GGTATAGAAT GTGGAGGAGC TATCCCGGTA GTTGCTTCAG GACGTCGGTC 

651 TTGGGTGGAT ACCCACACGC CATTTCTAAA CCTAGAGATG ATCTATGCAC 

701 ATCAGAATGA CTTTAAGGAA AACGGCACAG AAGGCCGTTC TTTCCAAAGT

751 GAAGACCTCT TCAATCTAGC GGTTCCTGTA GGGATAAAAT TTGAGAAATT 

801 CTCCGATAAG TCTACGTATG ATCTCTCCAT AGCTTACGTT CCCGATGTGA 

851 TTCGTAATGA TCCAGGCTGC ACGACAACTC TTATGGTTTC TGGGGATTCT 

901 TGGTCGACAT GTGGTACAAG CTTGTCTAGA CAAGCTCTTC TTGTACGTGC 

951 TGGAAATCAT CATGCCTTTG CTTCAAACTT TGAAGTTTTC AGTCAGTTTG

1001 AAGTCGAGTT GCGAGGTTCT TCTCGTAGCT ATGCTATCGA TCTTGGAGGA 

1051 AGATTCGGAT TTTAA 

This sequence is frame-shifted with respect to cp0016. 
The PSORT algorithm predicts a cytoplasmic location (0.075).

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 8A.
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 8B) and for FACS analysis (Figure 8C). A his-tagged protein was also expressed. 

This protein also showed good cross-reactivity with human sera, including sera from patients with
pneumonitis. 

These experiments show that cp0017 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 9 

The following C.pneumoniae protein (pid 6172315) was expressed <SEQ ID 17; cp0014>:

1 MKSSFPKFVF STFAIFPLSM IATETVLDSS ASFDGNKNGN FSVRESQEDA
51 GTTYLFKGNV TLENIPGTGT AITKSCFNNT KGDLTFTGNG NSLLFQTVDA
101 GTVAGAAVNS SVVDKSTTFI GFSSLSFIAS PGSSITTGKG AVSCSTGSLS
151 LTKMSVCSSA KTFQRIMAVL SPQKLFH*

The cp0014 nucleotide sequence <SEQ ID 18> is:

1 ATGAAGTCTT CTTTCCCCAA GTTTGTATTT TCTACATTTG CTATTTTCCC 

51 TTTGTCTATG ATTGCTACCG AGACAGTTTT GGATTCAAGT GCGAGTTTCG

101 ATGGGAATAA AAATGGTAAT TTTTCAGTTC GTGAGAGTCA GGAAGATGCT 

151 GGAACTACCT ACCTATTTAA GGGAAATGTC ACTCTAGAAA ATATTCCTGG 

201 AACAGGCACA GCAATCACAA AAAGCTGTTT TAACAACACT AAGGGCGATT

251 TGACTTTCAC AGGTAACGGG AACTCTCTAT TGTTCCAAAC GGTGGATGCA 

301 GGGACTGTAG CAGGGGCTGC TGTTAACAGC AGCGTGGTAG ATAAATCTAC 

351 CACGTTTATA GGGTTTTCTT CGCTATCTTT TATTGCGTCT CCTGGAAGTT 

401 CGATAACTAC CGGCAAAGGA GCCGTTAGCT GCTCTACGGG TAGCTTGAGT 

451 TTGACAAAAA TGTCAGTTTG CTCTTCAGCA AAAACTTTTC AACGGATAAT

501 GGCGGTGCTA TCACCGCAAA AACTCTTTCA TTAA

This protein is frame-shifted with respect to cp0015. 
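
A frame-shift means that a stretch of DNA is read with its codon boundaries offset by one or two bases, which gives a completely different amino acid sequence. The sketch below illustrates this by translating the last 34 bases of SEQ ID 18 (positions 501-534, as transcribed here) in all three forward reading frames. It assumes the standard genetic code, the function names are hypothetical, and it is an illustration of the concept only, not an analysis of the actual cp0014/cp0015 junction.

```python
# Standard genetic code (NCBI translation table 1), indexed in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate_frame(dna, offset):
    """Translate `dna` starting `offset` bases in (0, 1 or 2).

    Stop codons are rendered as '*'; a trailing partial codon is ignored.
    """
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(offset, len(dna) - 2, 3))


def three_frames(dna):
    """Return the translations for the three forward reading frames."""
    return [translate_frame(dna, offset) for offset in range(3)]


# Positions 501-534 of SEQ ID 18, as transcribed from the listing (an assumed reading).
TAIL = "GGCGGTGCTATCACCGCAAAAACTCTTTCATTAA"

for offset, peptide in enumerate(three_frames(TAIL)):
    print(offset, peptide)
```

With this input, only the frame at offset 1 ends in a stop codon, consistent with the C-terminus listed for cp0014; the other two offsets read the same bases as unrelated peptides.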

The PSORT algorithm predicts an inner membrane location (0.047). 

The protein was expressed in E.coli and purified as a his-tag product, as shown in Figure 9A. A
GST-fusion was also expressed. The recombinant proteins were used to immunise mice, whose sera
were used in an immunoassay (Figure 9B) and for FACS analysis (Figure 9C). 

This protein also showed good cross-reactivity with human sera, including sera from patients with 
pneumonitis. 

These experiments suggest that cp0014 is a useful immunogen. These properties are not evident from
the sequence alone.

Example 10 

The following C.pneumoniae protein (pid 6172317) was expressed <SEQ ID 19; cp0015>:

1 MSALFSENTS SKKGGAIQTS DALTITGNQG EVSFSHNTSS DSGAAIFTEA
51 SVTISNNAKV SFIDNKVTGA SSSTTGDMSG GAICAYKTST DTKVTLTGNQ
101 MLLFSNNTST TAGGAIYVKK LELASGGLTL FSRNSVNGGT APKGGAIAIE
151 DSGELSLSAD SGDIVFLGNT VTSTTPGTNR SSIDLGTSAK MTALRSAAGR
201 AIYFYDPITT GSSTTVTDVL KVNETPADSA LQYTGNIIFT GEKLSETEAA
251 DSKNLTSKLL QPVTLSGGTL SLKHGVTLQT QAFTQQADSR LEMDVGTTLE
301 PADTSTINNL VINISSIDGA KKAKIETKAT SKNLTLSGTI TLLDPTGTFY
351 ENHSLKNPQS YDILELKASG TVTSTAVTPD PIMGEKFHYG YQGTWGPIVW
401 GTGASTTATF NWTKTGYIPN PERIGSLVPN SLWNAFIDIS SLHYLMETAN
451 EGLQGDRAFW CAGLSNFFHK DSTKTRRGFR HLSGGYVIGG NLHTCSDKIL
501 SAAFCQLFGR DRDYFVAKNQ GTVYGGTLYY QHNETYISLP CKLRPCSLSY
551 VPTEIPVLFS GNLSYTHTDN DLKTKYTTYP TVKGSWGNDS FALEFGGRAP
601 ICLDESALFE QYMPFMKLQF VYAHQEGFKE QGTEAREFGS SRLVNLALPI
651 GIRFDKESDC QDATYNLTLG YTVDLVRSNP DCTTTLRISG DSWKTFGTNL
701 ARQALVLRAG NHFCFNSNFE AFSQFSFELR GSSRNYNVDL GAKYQF*

This sequence is frame-shifted with respect to cp0014. 
The cp0015 nucleotide sequence <SEQ ID 20> is: 

1 ATGTCAGCTC TGTTTTCTGA AAATACCTCC TCAAAGAAAG GCGGAGCCAT 

51 TCAGACTTCC GATGCCCTTA CCATTACTGG AAACCAAGGG GAAGTCTCTT

101 TTTCTGACAA TACTTCTTCG GATTCTGGAG CTGCAATTTT TACAGAAGCC 

151 TCGGTGACTA TTTCTAATAA TGCTAAAGTT TCCTTTATTG ACAATAAGGT 

201 CACAGGAGCG AGCTCCTCAA CAACGGGGGA TATGTCAGGA GGTGCTATCT 

251 GTGCTTATAA AACTAGTACA GATACTAAGG TCACCCTCAC TGGAAATCAG 

301 ATGTTACTCT TCAGCAACAA TACATCGACA ACAGCGGGAG GAGCTATCTA

351 TGTGAAAAAG CTCGAACTGG CTTCCGGAGG ACTTACCCTA TTCAGTAGAA 

401 ATAGTGTCAA TGGAGGTACA GCTCCTAAAG GTGGAGCCAT AGCTATCGAA 

451 GATAGTGGGG AATTGAGTTT ATCCGCCGAT AGTGGTGACA TTGTCTTTTT 

501 AGGGAATACA GTCACTTCTA CTACTCCTGG GACGAATAGA AGTAGTATCG 

551 ACTTAGGAAC GAGTGCAAAG ATGACAGCTT TGCGTTCTGC TGCTGGTAGA

601 GCCATCTACT TCTATGATCC CATAACTACA GGATCATCCA CAACAGTTAC 

651 AGATGTCTTA AAAGTTAATG AGACTCCGGC AGATTCTGCA CTACAATATA 

701 CAGGGAACAT CATCTTCACA GGAGAAAAGT TATCAGAGAC AGAGGCCGCA 

751 GATTCTAAAA ATCTTACTTC GAAGCTACTA CAGCCTGTAA CTCTTTCAGG 

801 AGGTACTCTA TCTTTAAAAC ATGGAGTGAC TCTGCAGACT CAGGCATTCA

851 CTCAACAGGC AGATTCTCGT CTCGAAATGG ACGTAGGAAC TACTCTAGAA 

901 CCTGCTGATA CTAGCACCAT AAACAATTTG GTCATTAACA TCAGTTCTAT 

951 AGACGGTGCA AAGAAGGCAA AAATAGAAAC CAAAGCTACG TCAAAAAATC 

1001 TGACTTTATC TCGAACCATC ACTTTATTGG ACCCGACGGG CACGTTTTAT 

1051 GAAAATCATA GTTTAAGAAA TCCTCAGTCC TACGACATCT TAGAGCTCAA

1101 AGCTTCTGGA ACTGTAACAA GCACCGCAGT GACTCCAGAT CCTATAATGG 

1151 GTGAGAAATT CCATTACGGC TATCAGGGAA CTTGGGGCCC AATTGTTTGG 

1201 GGGACAGGGG CTTCTACGAC TGCAACCTTC AACTGGACTA AAACTGGCTA 

1251 TATTCCTAAT CCCGAGCGTA TCGGCTCTTT AGTCCCTAAT AGCOTATGGA

1301 ATGCATTTAT AGATATTAGC TCTCTCCATT ATCTTATGGA GACTGCAAAC

1351 GAAGGGTTGC AGGGAGACCG TGCTTTTTGG TGTGCTGGAT TATCTAACTT 

1401 CTTCCATAAG GATAGTACAA AAACACGACG CGGGTTTCGC CATTTGAGTG 

1451 GCGGTTATGT CATAGGAGGA AACCTACATA CTTGTTCAGA TAAGATTCTT 

1501 AGTGCTGCAT TTTGTCAGCT CTTTCGAAGA GATAGAGACT ACTTTGTAGC 

1551 TAAGAATCAA GGTACAGTCT ACGGAGGAAC TCTCTATTAC CAGCACAACG

1601 AAACCTATAT CTCTCTTCCT TGCAAACTAC GGCCTTGTTC GTTGTCTTAT 

1651 GTTCCTACAG AGATTCCTGT TCTCTTTTCA GGAAACCTTA GCTACACCCA 

1701 TACGGATAAC GATCTGAAAA CCAAGTATAC AACATATCCT ACTGTTAAAG 

1751 GAAGCTGGGG GAATGATAGT TTCGCTTTAG AATTCGGTGG AAGAGCTCCG 

1801 ATTTGCTTAG ATGAAAGTGC TCTATTTGAG CAGTACATGC CCTTCATGAA

1851 ATTGCAGTTT GTCTATGCAC ATCAGGAAGG TTTTAAAGAA CAGGGAACAG 

1901 AAGCTCGTGA ATTTGGAAGT AGCCGTCTTG TGAATCTTGC CTTACCTATC 

1951 GGGATCCGAT TTGATAAGGA ATCAGACTGC CAAGATGCAA CGTACAATCT 

2001 AACTCTTGGT TATACTGTGG ATCTTGTTCG TAGTAACCCC GACTGTACGA 

2051 CAACACTGCG AATTAGCGGT GATTCTTGGA AAACCTTCGG TACGAATTTG

2101 GCAAGACAAG CTTTAGTCCT TCGTGCAGGG AACCATTTTT GCTTTAACTC 

2151 AAATTTTGAA GCCTTTAGCC AATTTTCTTT TGAATTGCGT GGGTCATCTC 

2201 GCAATTACAA TGTAGACTTA GGAGCAAAAT ACCAATTCTA A

The PSORT algorithm predicts a cytoplasmic location (0.274). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 10A.
The recombinant protein was used to immunise mice, whose sera were used in a Western blot
(Figure 10B) and for FACS analysis. A his-tagged protein was also expressed.

These experiments show that cp0015 is a useful immunogen. These properties are not evident from
the sequence alone.

Example 11 

The following C.pneumoniae protein (pid 6172325) was expressed <SEQ ID 21; cp0019>:

1 LQDSQDYSFV KLSPGAGGTI ITQDASQKPL EVAPSRPHYG YQGHWNVQVI
51 PGTGTQPSQA NLEWVRTGYL PNPERQGSLV PNSLWGSFVD QRAIQEIMVN
101 SSQILCQERG VWGAGIANFL HRDKINEHGY RHSGVGYLVG VGTHAFSDAT
151 INAAFCQLFS RDKDYVVSKN HGTSYSGVVF LEDTLEFRSP QGFYTDSSSE
201 ACCNQVVTID MQLSYSHRNN DMKTKYTTYP EAQGSWANDV FGLEFGATTY
251 YYPNSTFLFD YYSPFLRLQC TYAHQEDFKE TGGEVRHFTS GDLFNLAVPI
301 GVKFERFSDC KRGSYELTLA YVPDVIRKDP KSTATLASGA TWSTHGNNLS
351 RQGLQLRLGN HCLINPGIEV FSHGAIELRG SSRNYNINLG GKYRF*

This sequence is frame-shifted with respect to cp0018.

The cp0019 nucleotide sequence <SEQ ID 22> is:

1 TTGCAAGACT CTCAAGACTA TAGCTTTGTA AAGTTATCTC CAGGAGCGGG
51 AGGGACTATA ATTACTCAAG ATGCTTCTCA GAAGCCTCTT GAAGTAGCTC
101 CTTCTAGACC ACATTATGGC TATCAAGGAC ATTGGAATGT GCAAGTCATC
151 CCAGGAACGG GAACTCAACC GAGCCAGGCA AATTTAGAAT GGGTGCGGAC
201 AGGATACCTT CCGAATCCCG AACGGCAAGG ATCTTTAGTT CCCAATAGCC
251 TGTGGGGTTC TTTTGTTGAT CAGCGTGCTA TCCAAGAAAT CATGGTAAAT
301 AGTAGCCAAA TCTTATGTCA GGAACGGGGA GTCTGGGGAG CTGGAATTGC
351 TAATTTCCTA CATAGAGATA AAATTAATGA GCACGGCTAT CGCCATAGCG
401 GTGTCGGTTA TCTTGTGGGA GTTGGCACTC ATGCTTTTTC TGATGCTACG
451 ATAAATGCGG CTTTTTGCCA GCTCTTCAGT AGAGATAAAG ACTACGTAGT
501 ATCCAAAAAT CATGGAACTA GCTACTCAGG GGTCGTATTT CTTGAGGATA
551 CCCTAGAGTT TAGAAGTCCA CAGGGATTCT ATACTGATAG CTCCTCAGAA
601 GCTTGCTGTA ACCAAGTCGT CACTATAGAT ATGCAGTTGT CTTACAGCCA
651 TAGAAATAAT GATATGAAAA CCAAATACAC GACATATCCA GAAGCTCAGG
701 GATCTTGGGC AAATGATGTT TTTGGTCTTG AGTTTGGAGC GACTACATAC
751 TACTACCCTA ACAGTACTTT TTTATTTGAT TACTACTCTC CGTTTCTCAG
801 GCTGCAGTGC ACCTATGCTC ACCAGGAAGA CTTCAAAGAG ACAGGAGGTG
851 AGGTTCGTCA CTTTACTAGC GGAGATCTTT TCAATTTAGC AGTTCCTATT
901 GGCGTGAAGT TTGAGAGATT TTCAGACTGT AAAAGGGGAT CTTATGAACT
951 TACCCTTGCT TATGTTCCTG ATGTGATTCG CAAAGATCCC AAGAGCACGG
1001 CAACATTGGC TAGTGGAGCT ACGTGGAGCA CCCACGGAAA CAATCTCTCC
1051 AGACAAGGAT TACAACTGCG TTTAGGGAAC CACTGTCTCA TAAATCCTGG
1101 AATTGAGGTG TTCAGTCACG GAGCTATTGA ATTGCGGGGA TCCTCTCGTA
1151 ATTATAACAT CAATCTCGGG GGTAAATACC GATTTTAA

The PSORT algorithm predicts a cytoplasmic location (0.189).

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 11A.
This protein was used to immunise mice, whose sera were used in a Western blot (Figure 11B) and
an immunoblot assay (Figure 11C). A his-tagged protein was also expressed. 

These experiments show that cp0019 is a useful immunogen. These properties are not evident from 
the sequence alone. 



Example 12

The following C.pneumoniae protein (pid 4376466) was expressed <SEQ ID 23; cp6466>:

1 MRKISVGICI TILLSLSVVL QGCKESSHSS TSRGELAINI RDEPRSLDPR
51 QVRLLSEISL VKHIYEGLVQ ENNLSGNIEP ALAEDYSLSS DGLTYTFKLK
101 SAFWSNGDPL TAEDFIESWK QVATQEVSGI YAFALNPIKN VRKIQEGHLS
151 IDHFGVHSPN ESTLVVTLES PTSHFLKLLA LPVFFPVHKS QRTLQSKSLP
201 IASGAFYPKN IKQKQWIKLS KNPHYYNQSQ VETKTITIHF IPDANTAAKL
251 FNQGKLNWQG PPWGERIPQE TLSNLQSKGH LHSFDVAGTS WLTFNINKFP
301 LNNMKLEEAL ASALDKEALV STIFLGRAKT ADHLLPTNIH SYPEHQKQEM
351 AQRQAYAKKL FKEALEELQI TAKDLEHLNL IFPVSSSASS LLVQLIREQW
401 KESLGFAIPI VGKEFALLQA DLSSGNFSLA TGGWFADFAD PMAFLTIFAY
451 PSGVPPYAIN HKDFLEILQN IEQEQDHQKR SELVSQASLY LETFHIIEPI
501 YHDAFQFAMN KKLSNLGVSP TGVVDFRYAK EN*

WO 02/02606

PCT/IB01/01445

-52-

A predicted signal peptide is highlighted.

The cp6466 nucleotide sequence <SEQ ID 24> is:

1 ATGCGCAAGA TATCAGTGGG AATCTGTATC ACCATTCTCC TTAGCCTCTC
51 CGTAGTCCTC CAAGGCTGCA AGGAGTCCAG TCACTCCTCT ACATCTCGGG
101 [illegible] TATTAATATA AGAGATGAAC CCCGTTCTTT AGATCCAAGA
151 [illegible] TTCTTTCAGA AATCAGCCTT GTCAAACATA TCTATGAGGG
201 [illegible] GAAAATAATC TOTCAGGAAA TATAGAGCCT GCTCTTGCAG
251 AAGACTACTC TCTTTCCTCG GACGGACTCA CTTATACTTT TAAACTGAAA
301 [illegible] GGAGTAATGG CGACCCCTTA ACAGCTGAAG ACTTFATAGA
351 [illegible] CAAGTAGCTA CTCAAGAAGT CTCAGGAATC TATGCTTTTG
401 CCTTGAATCC AATTAAAAAT GTACGAAAGA TCCAAGAGGG ACACCTCTCC
451 ATAGACCATT TTGGAGTGCA CTCTCCTAAT GAATCTACAC TTGTTGTTAC
501 CCTGGAATCC CCAACCTCGC ATTTCTTAAA ACTTTTAGCT CTTCCAGTCT
551 TTTTCCCCGT TCATAAATCT CAAAGAACCC TGCAATCCAA ATCTCTACCT
601 ATAGCAAGCG GAGCTTTCTA TCCTAAAAAT ATCAAACAAA AACAATGGAT
651 AAAACTCTCA AAAAACCCTC ACTACTATAA TCAAAGTCAG GTGGAAACTA
701 AAACGATTAC GATTCACTTC ATTCCCGATG CAAACACAGC AGCAAAACTA
751 TTTAATCAGG GAAAACTCAA TTGGCAAGGA CCTCCTTGGG GAGAACGCAT
801 TCCTCAAGAA ACCCTATCCA ATTTACAGTC TAAGGGGCAC TTACACTCTT
851 TTGATGTCGC AGGAACCTCA TGGCTCACCT TCAATATCAA TAAATTCCCC
901 CTCAACAATA TGAAGCTTAG AGAAGCCTTA GCATCAGCCT TAGATAAGGA
951 AGCTCTTGTC TCAACTATAT TCTTAGGCCG TGCAAAAACT GCCGATCATC
1001 TCCTACCTAC AAATATTCAT AGCTATCCCG AACATCAAAA ACAAGAGATG
1051 GCACAACGCC AAGCTTACGC TAAAAAACTC TTTAAAGAAG CTTTAGAAGA
1101 ACTCCAAATC ACTGCTAAAG ATCTCGAACA TCTTAATCTT ATCTTTCCCG
1151 TTTCCTCGTC AGCAAGTTCT TTACTAGTCC AACTTATACG AGAACAGTGG
1201 AAAGAAAGTT TAGGGTTCGC TATCCCTATT GTCGGAAAGG AATTTGCTCT
1251 TCTCCAAGCA GACCTATCTT CAGGGAACTT CTCTTTAGCT ACAGGAGGAT
1301 GGTTCGCAGA CTTTGCTGAT CCTATGGCAT TTCTAACGAT CTTTGCTTAT
1351 CCATCAGGAG TTCCTCCTTA TGCAATCAAC CATAAGGACT TCCTAGAAAT
1401 TCTACAAAAC ATAGAACAAG AGCAAGATCA CCAAAAACGC TCGGAATTAG
1451 TGTCGCAAGC TTCTCTTTAC CTAGAGACCT TTCATATTAT TGAGCCGATC
1501 TACCACGACG CATTTCAATT TGCTATGAAT AAAAAACTTT CTAATCTAGG
1551 AGTCTCACCA ACAGGAGTTG TGGACTTCCG TTATGCrAAG GAAAATTAG

The PSORT algorithm predicts that the protein is an outer membrane lipoprotein (0.790). 

The protein was expressed in E.coli and purified both as a GST-fusion product and a His-tag fusion
product. Purification of the protein as a GST-fusion product is shown in Figure 12A. The
recombinant proteins were used to immunise mice, whose sera were used in Western blots (Figures 
12B and 12C). FACS analysis was also performed. 

These experiments show that cp6466 is a useful immunogen. These properties are not evident from 
the sequence alone. 



Example 13 

The following C.pneumoniae protein (pid 4376468) was expressed <SEQ ID 25; cp6468>:

1 MFSRWITLFL LFISLTGCSS YSSKHKQSLI IPIHDDPVAF SPEQAKRAMD
51 LSIAQLLFDG LTRETHKESN DLELAIASRY TVSEDFCSYT FFIKDSALWS
101 DGTPITSEDI RNAWEYAQEN SPHIQIFQGL NFSTPSSNAI TIHLDSPNPD
151 FPKLLAFPAF AIFKPENPKL FSGPYTLVEY FPGHNIHLKK NPNYYDYHCV
201 SINSIKLLII PDIYTAIHLL NRGKVDWVGQ PWHQGIPWEL HKQSQYHYYT
251 YPVEGAFWLC LNTKSPHLND LQNRHRLATC IDKRSIIEEA LQGTQQPAET






301 LSRGAPQPNQ YKKQKPLTPQ EKLVLTYPSD ILRCQRIAEI LKEQWKAAGI
351 PLILEGLEYH LFVNKRKVQD YAIATQTGVA YYPGANLISE EDKLLQNFEI
401 IPIYYLSYDY LTQDFIEGVI YNASGAVDLK YTYFP*

A predicted signal peptide is highlighted. 
The cp6468 nucleotide sequence <SEQ ID 26> is:

1 ATGTTTTCAC GATGGATCAC CCTCTTTTTA TTATTCATTA GCCTTACTGG 

51 ATGCTCCTCC TACTCTTCAA AACATAAACA ATCTTTAATT ATTCCCATAC 

101 ATGACGACCC TGTAGCTTTT TCTCCTGAAC AAGCAAAACG GGCCATGGAC 

151 CTTTCTATTG CCCAACTTCT TTTTGATGGT CTGACTAGAG AAACTCATCG 

201 CGAATCCAAT GATTTGGAAT TAGCGATTGC CAGTCGCTAT ACAGTCTCTG

251 AAGACTTTTG CTCTTATACG TTCTTTATCA AAGACAGCGC TTTATGGAGC 

301 GACGGAACAC CAATCACCTC CGAAGATATC CGTAACGCTT GGGAGTATGC

351 ACAGGAGAAC TCTCCCCACA TACAGATCTT CCAAGGACTT AACTTCTCAA 

401 CTCCTTCATC AAATGCAATT ACGATTCATC TCGACTCGCC CAACCCCGAT 

451 TTTCCTAAGC TTCTTGCCTT TCCTGCATTT GCTATCTTTA AACCAGAAAA

501 CCCGAAGCTC TTTAGCGGTC CGTATACTCT TGTAGAGTAT TTCCCAGGGC 

551 ATAACATTCA TTTAAAGAAA AACCCTAACT ATTACGACTA CCACTGCGTC 

601 TCCATCAACT CCATCAAACT GCTCATTATT CCTGATATAT ATACAGCCAT 

651 CCACCTCCTA AACAGAGGCA AGGTGGACTG GGTAGGACAA CCCTGGCATC 

701 AAGGGATTCC OTGGGAGCTC CATAAACAAT CGCAATATCA CTACTACACC

751 TATCCTGTAG AAGGTGCCTT CTGGCTTTGT CTAAATACAA AATCCCCACA 

801 CTTAAATGAT CTTCAAAACA GACATAGACT CGCTACTTGT ATTGATAAAC 

851 GTTCTATCAT TGAAGAAGCT CTTCAAGGAA CCCAACAACC AGCGGAAACA 

901 CTGTCCCGAG GAGCTCCACA ACCAAATCAA TATAAAAAAC AAAAGCCTCT 

951 AACTCCACAA GAAAAACTCG TGCTTACCTA TCCCTCAGAT ATTCTAAGAT

1001 GCCAACGCAT AGCAGAAATC TTAAAGGAAC AATGGAAAGC TGCTGGAATA 

1051 GATTTAATCC TTGAAGGACT CGAATACCAT CTGTTTGTTA ACAAACGAAA 

1101 AGTCCAAGAC TACGCCATAG CAACACAGAC TGGAGTTGCT TATTACCCAG 

1151 GAGCAAATCT AATTTCTGAA GAAGACAAGC TCCTGCAAAA CTTTGAGATT 

1201 ATCCCGATCT ACTATCTGAG CTATGACTAT CTCACTCAAG ATTTTATAGA

1251 GGGAGTAATC TATAATGCTT CTGGAGCTGT AGATCTCAAA TATACCTATT 

1301 TCCCCTAG 

The PSORT algorithm predicts that this protein is an outer membrane lipoprotein (0.790). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 13A.
The recombinant protein was used to immunise mice, whose sera were used in a Western blot
(Figure 13B) and for FACS analysis. A his-tagged protein was also expressed. 

These experiments show that cp6468 is a useful immunogen. These properties are not evident from 
the sequence alone. 

Example 14 

The following C.pneumoniae protein (pid 4376469) was expressed <SEQ ID 27; cp6469>:

1 MKMHRLKPTL KSLIPNLLFL LLTLSSCSKQ KQEPLGKHLV IAMSHDLADL
51 DPRNAYLSRD ASLAKALYEG LTRETDQGIA LALAESYTLS KDHKVYTFKL
101 RPSVWSDGTP LTAYDFEKSI KQLYFEEFSP SIHTLLGVIK NSSAIHNAQK
151 SLETLGIQAK DDLTLVITLE QPFPYFLTLI ARPVFSPVHH TLRESYKKGT
201 PPSTYISNGP FVLKKHEHQN YLILEKNPHY YDHESVKLDR VTLKIIPDAS
251 TATKLFKSKS IDWIGSPWSA PISNEDQKVL SQEKILTYSV SSTTLLIYNL
301 QKPLIQNKAL RKAIAHAIDR KSILRLVPSG QEAVTLVPPN LSQLNLQKEI
351 STEERQTKAR AYFQEAKETL SEKELAELSI LYPIDSSNSS IIAQEIQRQL
401 KDTLGLKIKI QGMEYHCFLK KRRQGDFFIA TGGWIAEYVS PVAFLSILGN
451 PRDLTQWRNS DYEKTLEKLY LPHAYKENKK RAEMIIEEET PIIPLYHGKY
501 IYAIHPKIQN TFGSLLGHTD LKNIDILS*



A predicted signal peptide is highlighted. 

The cp6469 nucleotide sequence <SEQ ID 28> is: 






1 ATGAAGATGC ATAGGCTTAA ACCTACCTTA AAAAGTCTGA TCCCTAATCT

51 TCTTTTCTTA TTGCTCACTC TTTCAAGCTG CTCAAAGCAA AAACAAGAAC 

101 CCTTAGGAAA ACATCTCGTT ATTGCGATGA GCCATGATCT CGCCGACCTA 

151 GATCCTCGCA ATGCCTATTT AAGCAGAGAT GCTTCCCTAG CAAAAGCCCT

201 CTATGAAGGA CTGACAAGAG AAACTGATCA AGGAATCGCA CTGGCTCTTG 

251 CAGAAAGTTA TACCCTGTCA AAAGATCATA AGGTCTATAC CTTTAAACTC 

301 AGACCTTCTG TGTGGAGCGA TGGCACTCCA CTCACTGCTT ATGACTTTGA 

351 AAAATCTATA AAACAACTGT ACTTCGAAGA ATTTTCACCT TCCATACATA

401 CTTTACTCGG CGTGATTAAA AATTCTTCGG CAATCCACAA TGCTCAAAAA

451 TCTCTGGAAA CTCTTGGGAT ACAGGCAAAA GATGATCTTA CTTTGGTGAT

501 TACCCTAGAG CAACCTTTCC CATACTTTCT CACACTTATC GCTCGCCCCG

551 TATTCTCCCC TGTTCATCAC ACCCTTAGGG AATCCTATAA GAAAGGAACA 

601 CCCCCATCCA CATACATCTC CAATGGGCCC TTTGTCTTAA AAAAACATGA 

651 ACACCAAAAC TACTTAATTT TAGAAAAAAA TCCTCACTAC TATGATCATG

701 AATCAGTAAA GTTAGACCGA GTCACCTTAA AAATTATCCC AGACGCCTCC

751 ACAGCCACGA AACTTTTCAA AAGTAAATCT ATAGATTGGA TTGGCTCACC 

801 TTGGAGCGCT CCGATATCTA ACGAAGACCA AAAAGTTCTC TCCCAAGAAA 

851 AGATTCTTAC CTATTCTGTT TCAAGCACCA CCCTTCTTAT CTATAACCTG 

901 CAAAAACCTC TAATACAAAA TAAAGCCCTC AGGAAAGCCA TTGCTCATGC

951 TATTGATAGA AAATCTATCT TAAGACTCGT GCCTTCAGGA CAAGAAGCTG

1001 TAACTCTAGT TCCCCCAAAT CTTTCACAAC TCAATCTTCA AAAAGAGATC 

1051 TCAACAGAAG AACGACAAAC AAAAGCCAGA GCATATTTTC AAGAAGCTAA 

1101 AGAAACACTT TCTGAAAAAG AACTCGCAGA ACTCAGCATC CTCTATCCTA 

1151 TAGATTCCTC GAATTCCTCC ATCATAGCTC AAGAAATCCA AAGACAACTT

1201 AAAGATACCT TAGGATTGAA AATCAAAATC CAAGGCATGG AGTACCACTG

1251 CTTTTTAAAG AAACGTCGTC AAGGAGATTT CTTCATAGCG ACAGGAGGAT 

1301 GGATTGCGGA ATACGTAAGC CCCGTAGCCT TCCTATCTAT TCTAGGCAAC 

1351 CCCAGAGACC TCACACAATG GAGAAACAGT GATTACGAAA AGACTTTAGA 

1401 GAAACTCTAT CTCCCTCATG CCTACAAAGA GAATTTAAAA CGCGCAGAAA 

1451 TGATAATAGA AGAAGAAACC CCGATTATCC CCCTGTATCA CGGCAAATAT 

1501 ATTTACGCTA TACATCCTAA AATCCAGAAT ACATTCGGAT CTCTTCTAGG 

1551 CCACACAGAT CTCAAAAATA TCGATATCTT AAGTTAG 

The PSORT algorithm predicts a periplasmic location (0.934). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 14A.
The recombinant protein was used to immunise mice, whose sera were used in a Western blot
(Figure 14B) and for FACS analysis. A his-tagged protein was also expressed. 

These experiments show that cp6469 is a useful immunogen. These properties are not evident from 
the sequence alone. 

Example 15 

The following C.pneumoniae protein (pid 4376602) was expressed <SEQ ID 29; cp6602>: 

1 MAASGGTGGL GGTQGVNLAA VEAAAAKADA AEVVASQEGS EMNMIQQSQD 

51 LTNPAAATRT KKKEEKFQTL ESRKKGEAGK AEKKSESTEE KPDTDLADKY 

101 ASGNSEISGQ ELRGLRDAIG DDASPEDILA LVQEKIKDPA LQSTALDYLV 

151 QTTPPSQGKL KEALIQARNT HTEQFGRTAI GAKNILFASQ EYADQLNVSP 

201 SGLRSLYLEV TGDTHTCDQL LSMLQDRYTY QDMAIVSSFL MKGMATELKR 

251 QGPYVPSAQL QVLMTETRNL QAVLTSYDYF ESRVPILLDS LKAEGIQTPS 

301 DLNFVKVAES YHKTINDKFP TASKVEREVR NLIGDDVDSV TGVLNLFFSA 

351 LRQTSSRLFS SADKRQQLGA MIANALDAVN INNEDYPKAS DFPKPYPWS* 

The cp6602 nucleotide sequence <SEQ ID 30> is: 

1 ATGGCAGCAT CAGGAGGCAC AGGTGGTTTA GGAGGCACTC AGGGTGTCAA 

51 CCTTGCAGCT GTAGAAGCTG CAGCTGCAAA AGCAGATGCA GCAGAAGTTG 

101 TAGCCAGCCA AGAAGGTTCT GAGATGAACA TGATTCAACA ATCTCAGGAC 

151 CTGACAAATC CCGCAGCAGC AACACGCACG AAAAAAAAGG AAGAGAAGTT 

201 TCAAACTCTA GAATCTCGGA A&AAAGGAGA AGCTGGAAAG GCTGAGAAAA 

251 AATCTGAATC TACAGAAGAG AAGCCTGACA CAGATCTTGC TGATAAGTAT 

301 GCTTCTGGGA ATTCTGAAAT CTCTGGTCAA GAACTTCGCG GCCTGCGTGA 

351 TGCAATAGGA GACGATGCTT CTCCAGAAGA CATTCTTGCT CTTGTACAAG 




401 AGAAAATTAA AGACCCAGCT CTGCAATCCA CAGCTTTGGA CTACCTGGTT 

451 CAAACGACTC CACCCTCCCA AGGTAAATTA AAAGAAGCGC TTATCCAAGC 

501 AAGGAATACT CATACGGAGC AATTCGGACG AACTGCTATT GGTGCGAAAA 

551 ACATCTTATT TGCCTCTCAA GAATATGCAG ACCAACTGAA TGTTTCTCCT 

601 TCAGGGCTTC GCTCTTTGTA CTTAGAAGTG ACTGGAGACA CACATACCTG 

651 TGATCAGCTA CTTTCTATGC TTCAAGACCG CTATACCTAC CAAGATATGG 

701 CTATTGTCAG CTCCTTTCTA ATGAAAGGAA TGGCAACAGA ATTAAAAAGG 

751 CAGGGTCCCT ACGTACCCAG TGCGCAACTA CAAGTTCTCA TGACAGAAAC 

801 TCGTAACCTG CAAGCAGTTC TTACCTCGTA CGATTACTTT GAAAGTCGCG 

851 TTCCTATTTT ACTCGATAGC TTAAAAGCTG AGGGAATCCA AACTCCTTCT 

901 GATCTAAACT TTGTGAAGGT AGCTGAGTCC TACCATAAAA TCATTAACGA 

951 TAAGTTCCCA ACAGCATCTA AAGTAGAACG AGAAGTCCGC AATCTCATAG 

1001 GAGACGATGT TGATTCTGTG ACCGGTGTCT TGAACTTATT CTTTTCTGCT 

1051 TTACGTCAAA CGTCGTCACG CCTTTTCTCT TCAGCAGACA AACGTCAGCA 

1101 ATTAGGAGCT ATGATTGCTA ATGCTTTAGA TGCTGTAAAT ATAAACAATG 

1151 AAGATTATCC CAAAGCATCA GACTTCCCTA AACCCTATCC TTGGTCATGA 

The PSORT algorithm predicts a cytoplasmic location (0.080). 

The protein was expressed in E.coli and purified as both a His-tag and a GST-fusion product, as 
shown in Figure 15A. The recombinant proteins were used to immunise mice, whose sera were used 
in a Western blot (Figure 15B) and for FACS analysis (Figure 15C). 

The cp6602 protein was also identified in the 2D-PAGE experiment (Cpn0324). 

These experiments show that cp6602 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 16 

The following C.pneumoniae protein (pid 4376727) was expressed <SEQ ID 31; cp6727>: 



1 MKYSLPWLLT SSALVFSLHP LMAANTDLSS SDNYENGSSG SAAFTAKETS 
51 DASGTTYTLT SDVSITNVSA ITPADKSCFT NTGGALSFVG ADHSLVLQTI 
101 ALTHDGAAIN NTNTALSFSG FSSLLIDSAP ATGTSGGKGA ICVTNTEGGT 
151 ATFTDNASVT LQKNTSEKDG AAVSAYSIDL AKTTTAALLD QNTSTKNGGA 
201 LCSTANTTVQ GNSGTVTFSS NTATDKGGGI YSKEKDSTLD ANTGVVTFKS 
251 NTAKTGGAWS SDDHLALTGN TQVLFQENKT TGSAAQANKP EGCGGAICCY 
301 LATATDKTGL AISQNQEMSF TSNTTTANGG AIYATKCTLD GNTTLTFDQN 
351 TATAGCGGAI YTETEDFSLK GSTGTVTFST NTAKTGGALY SKGNSSLTGN 
401 TNLLFSGNKA TGPSNSSANQ EGCGGAILAF IDSGSVSDKT GLSIANNQEV 
451 SLTSNAATVS GGAIYATKCT LTGNGSLTFD GNTAGTSGGA IYTETEDFTL 
501 TGSTGTVTFS TNTAKTGGAL YSKGNNSLSG NTNLLFSGNK ATGPSNSSAN 
551 QEGCGGAILS FLESASVSTK KGLWIEDNEN VSLSGNTATV SGGAIYATKC 
601 ALHGNTTLTF DGNTAETAGG AIYTETEDFT LTGSTGTVTF STNTAKTAGA 
651 LHTKGNTSFT KNKALVFSGN SATATATTTT DQEGCGGAIL CNISESDIAT 
701 KSLTLTENES LSFINNTAKR SGGGIYAPKC VISGSESINF DGNTAETSGG 
751 AIYSKNLSIT ANGPVSFTNN SGGKGGAIYI ADSGELSLEA IDGDITFSGN 
801 RATEGTSTPN SIHLGAGAKI TKLAAAPGHT IYFYDPITME APASGGTIEE 
851 LVINPVVKAI VPPPQPKNGP IASVPVVPVA PANPNTGTIV FSSGKLPSQD 
901 ASIPANTTTI LNQKINLAGG NVVLKEGATL QVYSFTQQPD STVFMDAGTT 
951 LETTTTNNTD GSIDLKNLSV NLDALDGKRM ITIAVNSTSG GLKISGDLKF 
1001 HNNEGSFYDN PGLKANLNLP FLDLSSTSGT VNLDDFNPIP SSMAAPDYGY 
1051 QGSWTLVPKV GAGGKVTLVA EWQALGYTPK PELRATLVPN SLWNAYVNIH 
1101 SIQQEIATAM SDAPSHPGIW IGGIGNAFHQ DKQKENAGFR LISRGYIVGG 
1151 SMTTPQEYTF AVAFSQLFGK SKDYVVSDIK SQVYAGSLCA QSSYVIPLHS 
1201 SLRRHVLSKV LPELPGETPL VLHGQVSYGR NHHNMTTKLA NNTQGKSDWD 
1251 SHSFAVEVGG SLPVDLNYRY LTSYSPYVKL QVVSVNQKGF QEVAADPRIF 
1301 DASHLVNVSI PMGLTFKHES AKPPSALLLT LGYAVDAYRD HPHCLTSLTN 
1351 GTSWSTFATN LSRQAFFAEA SGHLKLLHGL DCFASGSCEL RSSSRSYNAN 
1401 CGTRYSF* 



A predicted signal peptide is highlighted. 

The cp6727 nucleotide sequence <SEQ ID 32> is: 



1 ATGAAATATT CTTTACCTTG GCTACTTACC TCTTCGGCTT TAGTTTTCTC 
51 CCTACATCCA CTAATGGCTG CTAACACGGA TCTCTCATCA TCCGATAACT 
101 ATGAAAATGG TAGTAGTGGT AGCGCAGCAT TCACTGCCAA GGAAACTTCG 
151 GATGCTTCAG GAACTACCTA CACTCTCACT AGCGATGTTT CTATTACGAA 
201 TGTATCTGCA ATTACTCCTG CAGATAAAAG CTGTTTTACA AACACAGGAG 
251 GAGCATTGAG TTTTGTTGGA GCTGATCACT CATTGGTTCT GCAAACCATA 
301 GCGCTTACGC ATGATGGTGC TGCAATTAAC AATACCAACA CAGCTCTTTC 
351 TTTCTCAGGA TTCTCGTCAC TCTTAATCGA CTCAGCTCCA GCAACAGGAA 
401 CTTCGGGCGG CAAGGGTGCT ATTTGTGTGA CAAATACAGA GGGAGGTACT 
451 GCGACTTTTA CTGACAATGC CAGTGTCACC CTCCAAAAAA ATACTTCAGA 
501 AAAAGATGGA GCTGCAGTTT CTGCCTACAG CATCGATCTT GCTAAGACTA 
551 CGACAGCAGC TCTCTTAGAT CAAAATACTA GCACAAAAAA TGGCGGGGCC 
601 CTCTGTAGTA CAGCAAACAC TACAGTCCAA GGAAACTCAG GAACGGTGAC 
651 CTTCTCCTCA AATACTGCTA CAGATAAAGG TGGGGGGATC TACTCAAAAG 
701 AAAAGGATAG CACGCTAGAT GCCAATACAG GAGTCGTTAC CTTCAAATCT 
751 AATACTGCAA AGACGGGGGG TGCTTGGAGC TCTGATGACA ATCTTGCTCT 
801 TACCGGCAAC ACTCAAGTAC TTTTTCAGGA AAATAAAACA ACCGGCTCAG 
851 CAGCACAGGC AAATAACCCG GAAGGTTGTG GTGGGGCAAT CTGTTGTTAT 
901 CTTGCTACAG CAACAGACAA AACTGGATTA GCCATTTCTC AGAATCAAGA 
951 AATGAGCTTC ACTAGTAATA CAACAACTGC GAATGGTGGA GCGATCTACG 
1001 CTACTAAATG TACTCTGGAT GGAAACACAA CTCTTACCTT CGATCAGAAT 
1051 ACTGCGACAG CAGGATGTGG CGGAGCTATC TATACAGAAA CTGAAGATTT 
1101 TTCTCTTAAG GGAAGTACGG GAACCGTGAC CTTCAGCACA AATACAGCAA 
1151 AGACAGGCGG CGCCTTATAT TCTAAAGGAA ACAGCTCGCT GACTGGAAAT 
1201 ACCAACCTGC TCTTTTCAGG GAACAAAGCT ACGGGCCCGA GTAATTCTTC 
1251 AGCAAATCAA GAGGGTTGCG GTGGGGCAAT CCTAGCCTTT ATTGATTCAG 
1301 GATCCGTAAG CGATAAAACA GGACTATCGA TTGCAAACAA CCAAGAAGTC 
1351 AGCCTCACTA GTAATGCTGC AACAGTAAGT GGTGGTGCGA TCTATGCTAC 
1401 CAAATGTACT CTAACTGGAA ACGGCTCCCT GACCTTTGAC GGCAATACTG 
1451 CTGGAACTTC AGGAGGGGCG ATCTATACAG AAACTGAAGA TTTTACTCTT 
1501 ACAGGAAGTA CAGGAACCGT GACCTTCAGC ACAAATACAG CAAAGACAGG 
1551 CGGCGCCTTA TATTCTAAAG GCAACAACTC TCTGTCTGGT AATACCAACC 
1601 TGCTCTTTTC AGGGAACAAA GCTACGGGCC CGAGTAATTC TTCAGCAAAT 
1651 CAAGAGGGTT GCGGTGGGGC AATCCTATCG TTTCTTGAGT CAGCATCTGT 
1701 AAGTACTAAA AAAGGACTCT GGATTGAAGA TAACGAAAAC GTGAGTCTCT 
1751 CTGGTAATAC TGCAACAGTA AGTGGCGGTG CGATCTATGC GACCAAGTGT 
1801 GCTCTGCATG GAAACACGAC TCTTACCTTT GATGGCAATA CTGCCGAAAC 
1851 TGCAGGAGGA GCGATCTATA CAGAAACCGA AGATTTTACT CTTACGGGAA 
1901 GTACGGGAAC CGTGACCTTC AGCACAAATA CAGCAAAGAC AGCAGGGGCT 
1951 CTACATACTA AAGGAAATAC TTCCTTTACC AAAAATAAGG CTCTTGTATT 
2001 TTCTGGAAAT TCAGCAACAG CAACAGCAAC AACAACTACA GATCAAGAAG 
2051 GTTGTGGTGG AGCGATCCTC TGTAATATCT CAGAGTCTGA CATAGCTACA 
2101 AAAAGCTTAA CTCTTACTGA AAATGAGAGT TTAAGTTTCA TTAACAATAC 
2151 GGCAAAAAGA AGTGGTGGTG GTATTTATGC TCCTAAGTGT GTAATCTCAG 
2201 GCAGTGAATC CATAAACTTT GATGGCAATA CTGCTGAAAC TTCGGGAGGA 
2251 GCGATTTATT CGAAAAACCT TTCGATTACA GCTAACGGTC CTGTCTCCTT 
2301 TACCAATAAT TCTGGAGGCA AGGGAGGCGC CATTTATATA GCCGATAGCG 
2351 GAGAACTTTC CTTAGAGGCT ATTGATGGGG ATATTACTTT CTCAGGGAAC 
2401 CGAGCGACTG AGGGAACTTC AACTCCCAAC TCGATCCATT TAGGTGCAGG 
2451 GGCTAAGATC ACTAAGCTTG CAGCAGCTCC TGGTCATACG ATTTATTTTT 
2501 ATGATCCTAT TACGATGGAA GCTCCTGCAT CTGGAGGAAC AATAGAGGAG 
2551 TTAGTCATCA ATCCTGTTGT CAAAGCTATT GTTCCTCCTC CCCAACCAAA 
2601 AAATGGTCCT ATAGCTTCAG TGCCTGTAGT CCCTGTAGCA CCTGCAAACC 
2651 CAAACACGGG AACTATAGTA TTTTCTTCTG GAAAACTCCC CAGTCAAGAT 
2701 GCCTCGATTC CTGCAAATAC TACCACCATA CTGAACCAGA AGATCAACTT 
2751 AGCAGGAGGA AATGTCGTTT TAAAAGAAGG AGCCACCCTA CAAGTATATT 
2801 CCTTCACACA GCAGCCTGAT TCTACAGTAT TCATGGATGC AGGAACGACC 
2851 TTAGAGAGCA CGACAACTAA CAATACAGAT GGCAGCATCG ATCTAAAGAA 
2901 TCTCTCTGTA AATCTGGATG CTTTAGATGG CAAGCGTATG ATAACGATTG 
2951 CCGTAAACAG CACAAGTGGG GGATTAAAAA TCTCAGGGGA TCTGAAATTC 
3001 CATAACAATG AAGGAAGTTT CTATGACAAT CCTGGGTTGA AAGCAAACTT 
3051 AAATCTTCCT TTCTTAGATC TTTCTTCTAC TTCAGGAACT GTAAATTTAG 
3101 ACGACTTCAA TCCGATTCCT TCTAGCATGG CTGCTCCGGA TTATGGGTAT 
3151 CAAGGGAGTT GGACTCTGGT TCCTAAAGTA GGAGCTGGAG GGAAGGTGAC 
3201 TTTGGTCGCG GAATGGCAAG CGTTAGGATA CACTCCTAAA CCAGAGCTTC 
3251 GTGCGACTTT AGTTCCTAAT AGCCTTTGGA ATGCTTATGT AAACATCCAT 
3301 TCTATACAGC AGGAGATCGC CACTGCGATG TCGGACGCTC CCTCACATCC 
3351 AGGGATTTGG ATTGGAGGTA TTGGCAACGC CTTCCATCAA GACAAGCAAA 
3401 AGGAAAATGC AGGATTCCGT TTGATTTCCA GAGGTTATAT TGTTGGTGGC 
3451 AGCATGACCA CCCCTCAAGA ATATACCTTT GCTGTTGCAT TCAGCCAACT 
3501 CTTTGGCAAA TCTAAGGATT ACGTAGTCTC GGATATTAAA TCTCAAGTCT 
3551 ATGCAGGATC TCTCTGTGCT CAGAGCTCTT ATGTCATTCC CCTGCATAGC 
3601 TCATTACGTC GCCACGTCCT CTCTAAGGTC CTTCCAGAGC TCCCAGGAGA 
3651 AACTCCCCTT GTTCTCCATG GTCAAGTTTC CTATGGAAGA AACCACCATA 
3701 ATATGACGAC AAAGCTTGCG AACAACAGAC AAGGGAAATC AGACTGGGAC 
3751 AGCCATAGCT TCGCTGTTGA AGTCGGTGGT TCTCTTCCTG TAGATCTAAA 
3801 CTAGAGATAC CTTACCAGCT ACTCTCCCTA TGTGAAACTC CAAGTTGTGA 
3851 GTGTAAATCA AAAAGGATTC CAAGAGGTTG CTGCTGATCC ACGTATCTTT 
3901 GACGCTAGCC ATCTGGTCAA CGTGTCTATC CCTATGGGAC TCACCTTCAA 
3951 ACACGAATCA GCAAAGCCCC CCAGTGCTTT GCTTCTTACT TTAGGTTACG 
4001 CTGTAGATGC TTACCGGGAT CACCCTCACT GCCTGACCTC CTTAACAAAT 
4051 GGCACCTCGT GGTCTACGTT TGCTACAAAC TTATCACGAC AAGCTTTCTT 
4101 TGCTGAGGCT TCTGGACATC TGAAGTTACT TCATGGTCTT GACTGCTTCG 
4151 CTTCTGGAAG TTGTGAACTG CGCAGCTCCT CAAGAAGCTA TAATGCAAAC 
4201 TGTGGAACTC GTTATTCTTT CTAA 

The PSORT algorithm predicts an outer membrane location (0.915). 

The protein was expressed in E.coli and purified as a his-tag product, as shown in Figure 16A. The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
16B) and for FACS analysis (Figure 16C). A GST-fusion protein was also expressed. 

The cp6727 protein was also identified in the 2D-PAGE experiment (Cpn0444). 

These experiments show that cp6727 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 17 

The following C.pneumoniae protein (pid 4376731) was expressed <SEQ ID 33; cp6731>: 

1 MKSSLHWFLI SSSLALPLSL NFSAFAAVVE INLGPTNSFS GPGTYTPPAQ 

51 TTNADGTIYN LTGDVSITNA GSPTALTASC FKETTGNLSF QGHGYQFLLQ 

101 NIDAGANCTF TNTAANKLLS FSGFSYLSLI QTTNATTGTG AIKSTGACSI 

151 QSNYSCYFGQ NFSNDNGGAL QGSSISLSLN PNLTFAKNKA TQKGGALYST 

201 GGITINNTLN SASFSENTAA NNGGAIYTEA SSFISSNKAI SFINNSVTAT 

251 SATGGAIYCS STSAPKPVLT LSDNGELNFI GNTAITSGGA IYTDNLVLSS 

301 GGPTLFKNNS AIDTAAPLGG AIAIADSGSL SLSALGGDIT FEGNTVVKGA 

351 SSSQTTTKNS INIGNTNAKI VQLRASQGNT IYFYDPITTS ITAALSDALN 

401 LNGPDLAGNP AYQGTIVFSG EKLSEAEAAE ADNLKSTIQQ PLTLAGGQLS 

451 LKSGVTLVAK SFSQSPGSTL LMDAGTTLET ADGITINNLV LNVDSLKETK 

501 KATLKATQAS QTVTLSGSLS LVEPSGNVYE DVSWNNPQVF SCLTLTADDP 

551 ANIHITDLAA DPLEKNPIHW GYQGNWALSW QEDTATKSKA ATLTWTKTGY 

601 NPNPERRGTL VANTLWGSFV DVRSIQQLVA TKVRQSQETR GIWCEGISNF 

651 FHKDSTKINK GFRHISAGYV VGATTTLASD NLITAAFCQL FGKDRDHFIN 

701 KNRASAYAAS LHLQHLATLS SPSKLRYLFG SESEQPVLFD AQISYIYSKN 

751 TMKTYYTQAP KGESSWYNDG CALELASSLP HTALSHEGLF HAYFPFIKVE 

801 ASYIHQDSFK ERNTTLVRSF DSGDLINVSV PIGITFERFS RHERASYEAT 

851 VIYVADVYRK NPDCTTALLI NNTSWKTTGT NLSRQAGIGR AGIFYAFSPN 

901 LEVTSNLSME IRGSSRSYNA DLGGKFQF* 

A predicted signal peptide is highlighted. 

The cp6731 nucleotide sequence <SEQ ID 34> is: 

1 ATGAAATCCT CTCTTCATTG GTTTTTAATC TCGTCATCTT TAGCACTTCC 

51 CTTGTCACTA AATTTCTCTG CGTTTGCTGC TGTTGTTGAA ATCAATCTAG 

101 GACCTACCAA TAGCTTCTCT GGACCAGGAA CCTACACTCC TCCAGCCCAA 

151 ACAACAAATG CAGATGGAAC TATCTATAAT CTAACAGGGG ATGTCTCAAT 

201 CACCAATGCA GGATCTCCGA CAGCTCTAAC CGCTTCCTGC TTTAAAGAAA 
251 CTACTGGGAA TCTTTCTTTC CAAGGCCACG GCTACCAATT TCTCCTACAA 

301 AATATCGATG CGGGAGCGAA CTGTACCTTT ACCAATACAG CTGCAAATAA 

351 GCTTCTCTCC TTTTCAGGAT TCTCCTATTT GTCACTAATA CAAACCACGA 

401 ATGCTACCAC AGGAACAGGA GCCATCAAGT CCACAGGAGC TTGTTCTATT 

451 CAGTCGAACT ATAGTTGCTA CTTTGGCCAA AACTTTTCTA ATGACAATGG 

501 AGGCGCCCTC CAAGGCAGCT CTATCAGTCT ATCGCTAAAC CCCAACCTAA 

551 CGTTTGCCAA AAACAAAGCA ACGCAAAAAG GGGGTGCCCT CTATTCCACG 

601 GGAGGGATTA CAATTAACAA TACGTTAAAC TCAGCATCAT TTTCTGAAAA 

651 TACCGCGGCG AACAATGGCG GAGCCATTTA CACGGAAGCT AGCAGTTTTA 

701 TTAGCAGCAA CAAAGCAATT AGCTTTATAA ACAATAGTGT GACCGCAACC 

751 TCAGCTACAG GGGGAGCCAT TTACTGTAGT AGTACATCAG CCCCCAAACC 

801 AGTCTTAACT CTATCAGACA ACGGGGAACT GAACTTTATA GGAAATACAG 

851 CAATTACTAG TGGTGGGGCG ATTTATACTG ACAATCTAGT TCTTTCTTCT 

901 GGAGGACCTA CGCTTTTTAA AAACAACTCT GCTATAGATA CTGCAGCTCC 

951 CTTAGGAGGA GCAATTGCGA TTGCTGACTC TGGATCTTTG AGTCTTTCGG 

1001 CTCTTGGTGG AGACATCACT TTTGAAGGAA ACACAGTAGT CAAAGGAGCT 

1051 TCTTCGAGTC AGACCACTAC CAGAAATTCT ATTAACATCG GAAACACCAA 

1101 TGCTAAGATT GTACAGCTGC GAGCCTCTCA AGGCAATACT ATCTACTTCT 

1151 ATGATCCTAT AACAACTAGC ATCACTGCAG CTCTCTCAGA TGCTCTAAAC 

1201 TTAAATGGTC CTGACCTTGC AGGGAATCCT GCATATCAAG GAACCATCGT 

1251 ATTTTCTGGA GAGAAGCTCT CGGAAGCAGA AGCTGCAGAA GCTGATAATC 

1301 TCAAATCTAC AATTCAGCAA CCTCTAACTC TTGCGGGAGG GCAACTCTCT 

1351 CTTAAATCAG GAGTCACTCT AGTTGCTAAG TCCTTTTCGC AATCTCCGGG 

1401 CTCTACCCTC CTCATGGATG CAGGGACCAC ATTAGAAACC GCTGATGGGA 

1451 TCACTATCAA TAATCTTGTT CTCAATGTAG ATTCCTTAAA AGAGACCAAG 

1501 AAGGCTACGC TAAAAGCAAC ACAAGCAAGT CAGACAGTCA CTTTATCTCG 

1551 ATCGCTCTCT CTTGTAGATC CTTCTGGAAA TGTCTACGAA GATGTCTCTT 

1601 GGAATAACCC TCAAGTCTTT TCTTGTCTCA CTCTTACTGC TGACGACCCC 

1651 GCGAATATTC ACATCACAGA CTTAGCTGCT GATCCCCTAG AAAAAAATCC 

1701 TATCCATTGG GGATACCAAG GGAATTGGGC ATTATCTTGG CAAGAGGATA 

1751 CTGCGACTAA ATCCAAAGCA GCGACTCTTA CCTGGACAAA AACAGGATAC 

1801 AATCCGAATC CTGAGCGTCG TGGAACCTTA GTTGCTAACA CGCTATGGGG 

1851 ATCCTTTGTT GATGTGCGCT CCATACAACA GCTTGTAGCC ACTAAAGTAC 

1901 GCCAATCTCA AGAAACTCGC GGCATCTGGT GTGAAGGGAT CTCGAACTTC 

1951 TTCCATAAAG ATAGCACGAA GATAAATAAA GGTTTTCGCC ACATAAGTGC 

2001 AGGTTATGTT GTAGGAGCGA CTACAACATT AGCTTCTGAT AATCTTATCA 

2051 CTGCAGCCTT CTGCCAATTA TTCGGGAAAG ATAGAGATCA CTTTATAAAT 

2101 AAAAATAGAG CTTCTGCCTA TGCAGCTTCT CTCCATCTCC AGCATCTAGC 

2151 GACCTTGTCT TCTCCAAGCT TGTTACGCTA CCTTCCTGGA TCTGAAAGTG 

2201 AGCAGCCTGT CCTCTTTGAT GCTCAGATCA GCTATATCTA TAGTAAAAAT 

2251 ACTATGAAAA CCTATTACAC CCAAGCACCA AAGGGAGAGA GCTCGTGGTA 

2301 TAATGACGGT TGCGCTCTGG AACTTGCGAG CTCCCTACCA CACACTGCTT 

2351 TAAGCCATGA GGGTCTCTTC CACGCGTATT TTCCTTTCAT CAAAGTAGAA 

2401 GCTTCGTACA TACACCAAGA TAGCTTCAAA GAACGTAATA CTACCTTGGT 

2451 ACGATCTTTC GATAGCGGTG ATTTAATTAA CGTCTCTGTG CCTATTGGAA 

2501 TTACCTTCGA GAGATTCTCG AGAAACGAGC GTGCGTCTTA CGAAGCTACT 

2551 GTCATCTACG TTGCCGATGT CTATCGTAAG AATCCTGACT GCACGACAGC 

2601 TCTCCTAATC AACAATACCT CGTGGAAAAC TACAGGAACG AATCTCTCAA 

2651 GACAAGCTGG TATCGGAAGA GCAGGGATCT TTTATGCCTT CTCTCCAAAT 

2701 CTTGAGGTCA CAAGTAACCT ATCTATGGAA ATTCGTGGAT CTTCACGCAG 

2751 CTACAATGCA GATCTTGGAG GTAAGTTCCA GTTCTAA 
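
The listing layout used for these sequences (a 1-based position label followed by groups of ten bases) can be checked mechanically. The following is a minimal Python sketch and is not part of the original disclosure; the function name `parse_listing` is hypothetical. It joins the rows into one contiguous sequence and verifies that each position label equals the running length plus one, which helps to detect rows that were dropped or damaged in transcription.

```python
def parse_listing(text):
    """Join 'position group group ...' rows into one contiguous sequence.

    Raises ValueError if a row does not start with a numeric position label
    or if the label does not match the number of residues read so far.
    """
    sequence = ""
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # blank separator line between rows
        label, groups = tokens[0], tokens[1:]
        if not label.isdigit():
            raise ValueError(f"row does not start with a position label: {line!r}")
        expected = len(sequence) + 1
        if int(label) != expected:
            raise ValueError(f"label {label} found where {expected} was expected")
        sequence += "".join(groups).upper()
    return sequence


if __name__ == "__main__":
    # First two rows of the cp6731 nucleotide listing, as given above.
    rows = """
    1 ATGAAATCCT CTCTTCATTG GTTTTTAATC TCGTCATCTT TAGCACTTCC

    51 CTTGTCACTA AATTTCTCTG CGTTTGCTGC TGTTGTTGAA ATCAATCTAG
    """
    print(len(parse_listing(rows)))
```

Checking the labels rather than simply discarding them is a deliberate choice in this sketch: a mismatch signals a missing or shortened row instead of silently producing a shorter sequence.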

The PSORT algorithm predicts an outer membrane location (0.926). 

The protein was expressed in E.coli and purified as a his-tag product, as shown in Figure 17A. A 
GST-fusion protein was also expressed. The recombinant proteins were used to immunise mice, 
whose sera were used in a Western blot (Figure 17B; his-tag) and for FACS analysis (Figure 17C; 
his-tag and GST-fusion). 

The GST-fusion protein also showed good cross-reactivity with human sera, including sera from 
patients with pneumonitis. Less cross-reactivity was seen with the his-fusion. 

These experiments show that cp6731 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 18 

The following C.pneumoniae protein (pid 4376737) was expressed <SEQ ID 35; cp6737>: 



1 MPLSFKSSSF CLIACLCSAS CAFAETRLGG NFVPPITNQG EEILLTSDFV 
51 CSNFLGASFS SSFINSSSNL SLLGKGLSLT FTSCQAPTNS NYALLSAAET 
101 LTFKNFSSIN FTGNQSTGLG GLIYGKDIVF QSIKDLIFTT NRVAYSPASV 
151 TTSATFAITT VTTGASALQP TDSLTVENIS QSIKFFGNLA NFGSAISSSP 
201 TAVVKFINNT ATMSFSHNFT SSGGGVIYGG SSLLFEMNSG CIIFTANSCV 
251 NSLKGVTPSS GTYALGSGGA ICIPTGTFEL KNNQGKCTFS YNGTPNDAGA 
301 IYAETCNIVG NQGALLLDSN TAARNGGAIC AKVLNIQGRG PIEFSRNRAE 
351 KGGAIFIGPS VGDPAKQTST LTILASEGDI AFQGNMLNTK PGIRNAITVE 
401 AGGEIVSLSA QGGSRLVFYD PITHSLPTTS PSNKDITINA NGASGSVVFT 
451 SKGLSSTELL LPANTTTILL GTVKIASGEL KITDNAVVNV LGFATQGSGQ 
501 LTLGSGGTLG LATPTGAPAA VDFTIGKLAF DPFSFLKRDF VSASVNAGTK 
551 NVTLTGALVL DEHDVTDLYD MVSLQTPVAI PIAVFKGATV TKTGFPDGEI 
601 ATPSHYGYQG KWSYTWSRPL LIPAPDGGFP GGPSPSANTL YAVWNSDTLV 
651 RSTYILDPER YGEIVSNSLW ISFLGNQAFS DILQDVLLID HPGLSITAKA 
701 LGAYVEHTPR QGHEGFSGRY GGYQAALSMN YTDHTTLGLS FGQLYGKTNA 
751 NPYDSRCSEQ MYLLSFFGQF PIVTQKSEAL ISWKAAYGYS KNHLNTTYLR 
801 PDKAPKSQGQ WHNNSYYVLI SAEHPFLNWC LLTRPLAQAW DLSGFISAEF 
851 LGGWQSKFTE TGDLQRSFSR GKGYNVSLPI GCSSQWFTPF KKAPSTLTIK 
901 LAYKPDIYRV NPHNIVTVVS NQESTSISGA NLRRHGLFVQ IHDVVDLTED 
951 TQAFLNYTFD GKNGFTKHRV STGLKSTF* 


A predicted signal peptide is highlighted. 

The cp6737 nucleotide sequence <SEQ ID 36> is: 



1 ATGCCTCTTT CTTTCAAATC TTCATCTTTT TGTCTACTTG CCTGTTTATG 
51 TAGTGCAAGT TGCGCGTTTG CTGAGACTAG ACTCGGAGGG AACTTTGTTC 
101 CTCCAATTAC GAATCAGGGT GAAGAGATCT TACTCACTTC AGATTTTGTT 
151 TGTTCAAACT TCTTGGGGGC GAGTTTTTCA AGTTCCTTTA TCAATAGTTC 
201 CAGCAATCTC TCCTTATTAG GGAAGGGCCT TTCCTTAACG TTTACCTCTT 
251 GTCAAGCTCC TACAAATAGT AACTATGCGC TACTTTCTGC CGCAGAGACT 
301 CTGACCTTCA AGAATTTTTC TTCTATAAAC TTTACAGGGA ACCAATCGAC 
351 AGGACTTGGC GGCCTCATCT ACGGAAAAGA TATTGTTTTC CAATCTATCA 
401 AAGATTTGAT CTTCACTACG AACCGTGTTG CCTATTCTCC AGCATCTGTA 
451 ACTACGTCGG CAACTCCCGC AATCACTACA GTAACTACAG GAGCCTCTGC 
501 TCTCCAACCT ACAGACTCAC TCACTGTCGA AAACATATCC CAATCGATCA 
551 AGTTTTTTGG GAACCTTGCC AACTTCGGCT CTGCAATTAG CAGTTCTCCC 
601 ACGGCAGTCG TTAAATTCAT CAATAACACC GCTACCATGA GCTTCTCCCA 
651 TAACTTTACT TCGTCAGGAG GCGGCGTGAT TTATGGAGGA AGCTCTCTCC 
701 TTTTTGAAAA CAATTCTGGA TGCATCATCT TCACCGCCAA CTCCTGTGTG 
751 AACAGCTTAA AAGGCGTCAC CCCTTCATCA GGAACCTATG CTTTAGGAAG 
801 TGGCGGAGCC ATCTGCATCC CTACGGGAAC TTTCGAATTA AAAAACAATC 
851 AGGGGAAGTG CACCTTCTCT TATAATGGTA CACCAAATGA TGCGGGTGCG 
901 ATCTACGCCG AAACCTGCAA CATCGTAGGG AACCAGGGTG CCTTGCTCCT 
951 AGATAGCAAC ACTGCAGCGA GAAATGGCGG AGCCATCTGT GCTAAAGTGC 
1001 TCAATATTCA AGGACGCGGT CCTATTGAAT TCTCTAGAAA CCGCGCGGAG 
1051 AAGGGTGGAG CTATTTTCAT AGGCCCCTCT GTTGGAGACC CTGCGAAGCA 
1101 AACATCGACA CTTACGATTT TGGCTTCCGA AGGTGATATT GCGTTCCAAG 
1151 GAAACATGCT CAATACAAAA CCTGGAATCC GCAATGCCAT CACTGTAGAA 
1201 GCAGGGGGAG AGATTGTGTC TCTATCTGCA CAAGGAGGCT CACGTCTTGT 
1251 ATTTTATGAT CCCATTACAC ATAGCCTCCC AACCACAAGT CCGTCTAATA 
1301 AAGACATTAC AATCAACGCT AATGGCGCTT CAGGATCTGT AGTCTTTACA 
1351 AGTAAGGGAC TCTCCTCTAC AGAACTCCTG TTGCCTGCCA ACACGACAAC 
1401 TATACTTCTA GGAACAGTCA AGATCGCTAG TGGAGAACTG AAGATTACTG 
1451 ACAATGCGGT TGTCAATGTT CTTGGCTTCG CTACTCAGGG CTCAGGTCAG 
1501 CTTACCCTGG GCTCTGGAGG AACCTTAGGG CTGGCAACAC CCACGGGAGC 
1551 ACCTGCCGCT GTAGACTTTA CGATTGGAAA GTTAGCATTC GATCCTTTTT 
1601 CCTTCCTAAA AAGAGATTTT GTTTCAGCAT CAGTAAATGC AGGCACAAAA 
1651 AACGTCACTT TAACAGGAGC TCTGGTTCTT GATGAACATG ACGTTACAGA 
1701 TCTTTATGAT ATGGTGTCAT TACAAACTCC AGTAGCAATT CCTATCGCTG 
1751 TTTTCAAAGG AGCAACCGTT ACTAAGACAG GATTTCCTGA TGGGGAGATT 
1801 GCGACTCCAA GCCACTACGG CTACCAAGGA AAGTGGTCCT ACACATGGTC 
1851 CCGTCCCCTG TTAATTCCAG CTCCTGATGG AGGATTTCCT GGAGGTCCCT 
1901 CTCCTAGCGC AAATACTCTC TATGCTGTAT GGAATTCAGA CACTCTCGTG 
1951 CGTTCTACCT ATATCTTAGA TCCCGAGCGT TACGGAGAAA TTGTCAGCAA 
2001 CAGCTTATGG ATOTCCTTCT TAGGAAATCA GGCATTCTCT GATATTCTCC 
2051 AAGATGTTCT TTTGATAGAT CATCCCGGGT TGTCCATAAC CGCGAAAGCT 
2101 TTAGGAGCCT ATGTCGAACA CACACCAAGA CAAGGACATG AGGGCTTTTC 
2151 AGGTCGCTAT GGAGGCTACC AAGCTGCGCT ATCTATGAAC TACACGGACC 
2201 ACACTACGTT AGGACTTTCT TTCGGGCAGC TTTATGGAAA AACTAACGCC 
2251 AACCCCTACG ATTCACGTTG CTCAGAACAA ATGTATTTAC TCTCGTTCTT 
2301 TGGTCAATTC CCTATCGTGA CTCAAAAGAG CGAGGCCTTA ATTTCCTGGA 
2351 AAGCAGCTTA TGGTTATTCC AAAAATCACC TAAATACCAC CTACCTCAGA 
2401 CCTGACAAAG CTCCAAAATC TCAAGGGCAA TGGCATAACA ATAGTTACTA 
2451 TGTTCTTATT TCTGCAGAAC ATCCTTTCCT AAACTGGTGT CTTCTTACAA 
2501 GACCTCTGGG TCAAGCTTGG GATCTTTCAG GTTTTATTTC CGCAGAATTC 
2551 CTAGGTGGTT GGCAAAGTAA GTTCACAGAA ACTGGAGATC TGCAACGTAG 
2601 CTTTAGTAGA GGTAAAGGGT ACAATGTTTC CCTACCGATA GGATGTTCTT 
2651 CTCAATGGTT CACACCATTT AAGAAGGCTC CTTCTACACT GACCATCAAA 
2701 CTTGCCTACA AGCCTGATAT CTATCGTGTC AACCCTCACA ATATTGTGAC 
2751 TGTCGTCTCA AACCAAGAGA GCACTTCGAT CTCAGGAGCA AATCTACGCC 
2801 GCCACGGTTT GTTTGTACAA ATCCATGATG TAGTAGATCT CACCGAGGAC 
2851 ACTCAGGCCT TTCTAAACTA TACCTTTGAC GGGAAAAATG GATTTACAAA 
2901 CCACCGAGTG TCTACAGGAC TAAAATCCAC ATTTTAA 

The PSORT algorithm predicts an outer membrane location (0.940). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 18A. 
The recombinant protein was used to immunise mice, whose sera were used in an immunoblot 
(Figure 18B) and for FACS analysis (Figure 18C). A his-tagged protein was also expressed. 

The cp6737 protein was also identified in the 2D-PAGE experiment (Cpn0454) and showed good 
cross-reactivity with human sera, including sera from patients with pneumonitis. 

These experiments show that cp6737 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 19 

The following C.pneumoniae protein (pid 4377090) was expressed <SEQ ID 37; cp7090>: 

1 MNIHSLWKLC TLIALLALPA CSLSPNYGWE DSCNTCHHTR RKKPSSFGFV 

51 PLYTEEDFNP NFTFGEYDSK EEKQYKSSQV AAFRNITFAT DSYTIKGEEN 

101 LAILTNLVHY MKKNPKATLY IEGHTDERGA ASYNLALGAR RANAIKEHLR 

151 KQGISADRLS TISYGKEHPL NSGHNELAWQ QNRRTEFKIH AR* 

A predicted signal peptide is highlighted. 

The cp7090 nucleotide sequence <SEQ ID 38> is: 

1 ATGAATATAC ATTCCCTATG GAAACTTTGT ACTTTATTGG CTTTACTTGC 

51 ATTGCCAGCA TGTAGCCTTT CCCCTAATTA TGGCTGGGAG GATTCCTGTA 

101 ATACATGCCA TCATACAAGA CGAAAAAAGC CTTCTTCTTT TGGCTTTGTT 

151 CCTCTCTATA CCGAAGAGGA CTTTAACCCT AATTTTACCT TCGGTGAGTA 

201 TGATTCCAAA GAAGAAAAAC AATACAAGTC AAGCCAAGTI* GCAGCATTTC 

251 GTAATATCAC CTTTGCTACA GACAGCTATA CAATTAAAGG TGAAGAGAAC 

301 CTTGCGATTC TCACGAACTT GGTTCACTAC ATGAAGAAAA ACCCGAAAGC 

351 TACACTGTAC ATTGAAGGGC ATACTGACGA GCGTGGAGCT GCATCCTATA 

401 ACCTTGCTTT AGGAGCACGA CGAGCCAATG CGATTAAAGA GCATCTCCGA 

451 AAGCAGGGAA TCTCTGCAGA TCGTCTATCT ACTATTTCCT ACGGAAAAGA 
501 ACATCCTTTA AATTCGGGAC ACAACGAACT AGCATGGCAA CAAAATCGCC 
551 GTACAGAGTT TAAGATTCAT GCACGCTAA 
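
The correspondence between a nucleotide listing and its protein listing can be checked by translating the coding sequence with the standard genetic code. A minimal Python sketch follows; it is an illustration rather than part of the original disclosure, the name `translate` is hypothetical, and it assumes that the reading frame starts at position 1 of the listing, as is the case for the sequences given here.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string in frame 1; whitespace between groups is ignored.

    Codons containing characters other than A, C, G or T are reported as 'X',
    and a trailing partial codon is dropped.
    """
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - 2, 3):
        residues.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(residues)


if __name__ == "__main__":
    # First three groups of the cp7090 nucleotide listing (SEQ ID 38).
    print(translate("ATGAATATAC ATTCCCTATG GAAACTTTGT"))
```

The printed residues can be compared directly with the start of the corresponding protein listing; reporting unreadable codons as `X` keeps the output aligned with the input, so a damaged position is easy to locate.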

The PSORT algorithm predicts an outer membrane location (0.790). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 19A. 
A his-tagged protein was also expressed. The recombinant proteins were used to immunise mice, 
whose sera were used in a Western blot (Figure 19B) and for FACS analysis. 

These experiments show that cp7090 is a useful immunogen. These properties are not evident from the 
sequence alone. 

Example 20 

The following C.pneumoniae protein (pid 4377091) was expressed <SEQ ID 39; cp7091>: 

1 MLRQLCFQVF FFCFASLVYA EELEVVVRSE HITLPIEVSC QTDTKDPKIQ 

51 KYLSSLTEIF CKDIALGDCL QPTAASKESS SPLAISLRLH VFQLSVVLLQ 

101 SSKTPQTLCS FTISQNLSVD RQKIHHAADT VHYALTGIPG ISAGKIVFAL 

151 SSLGKDQKLK QGELWTTDYD GKNLAPLTTE CSLSITPKWV GVGSNFPYLY 

201 VSYKYGVPKI FLGSLENTEG KKVLPLKGNQ LMPTFSPRKK LLAFVADTYG 

251 NPDLFIQPFS LTSGPMGRPR RLLNENFGTQ GNPSFNPEGS QLVFISNKDG 

301 RPRLYIMSLD PEPQAPRLLT KKYRNSSCPA WSPDGKKIAF CSVIKGVRQI 

351 CIYDLSSGED YQLTTSPTNK ESPSWAIDSR KLVFSAGNAE ESELYLISLV 

401 TKKTNKIAIG VGEKRFPSWG AFPQQPIKRT L* 

A predicted signal peptide is highlighted. 

The cp7091 nucleotide sequence <SEQ ID 40> is: 

1 ATGTTACGGC AACTATGCTT CCAAGTTTTT TTCTTTTGCT TCGCATCGCT 

51 AGTCTATGCT GAAGAATTAG AAGTTGTTGT CCGTTCCGAA CATATCACGC 

101 TCCCTATTGA GGTCTCTTGC CAGACCGATA CGAAAGATCC AAAAATACAG 

151 AAATACCTCA GCTCGCTAAC GGAGATATTT TGCAAGGACA TTGCCCTAGG 

201 AGATTGTCTA CAACCCACAG CGGCTTCTAA AGAATCGTCA TCTCCTTTAG 

251 CAATATCTTT ACGGTTGCAT GTACCTCAGC TATCTGTAGT GCTTTTACAG 

301 TCTTCAAAAA CTCCTCAAAC CTTATGTTCT TTTACTATTT CTCAAAATCT 

351 TTCTGTAGAT CGTCAAAAAA TCCATCACGC TGCTGATACA GTTCATTACG 

401 CCCTCACAGG GATTCCTGGA ATCAGTGCTG GGAAAATTGT TTTTGCTCTA 

451 AGTTCTTTAG GAAAAGATCA AAAGCTCAAG CAAGGAGAAT TATGGACTAC 

501 AGATTACGAT GGGAAAAACC TCGCCCCTTT AACCACAGAA TGTTCGCTCT 

551 CTATAACTCC AAAATGGGTG GGTGTGGGAT CAAATTTTCC CTATCTCTAT 

601 GTTTCGTATA AGTATGGTGT GCCTAAAATT TTTCTTGGTT CCCTAGAGAA 

651 CACTGAAGGT AAAAAAGTCC TTCCGTTAAA AGGCAACCAA CTCATGCCTA 

701 CGTTTTCTCC AAGAAAAAAG CTTTTAGCTT TCGTTGCTGA TACGTATGGA 

751 AATCCTGATT TATTTATTCA ACCGTTCTCA CTAACTTCAG GACCTATGGG 

801 TCGCCCACGT CGCCTCCTTA ATGAGAATTT CGGGACTCAA GGGAATCCCT 

851 CCTTCAACCC TGAAGGATCC CAGCTTGTCT TTATATCGAA CAAAGACGGC 

901 CGTCCGCGTC TTTATATTAT GTCCCTCGAT CCTGAACCCC AAGCACCTCG 

951 CTTGCTGACA AAAAAATACA GAAATAGCAG TTGCCCTGCA TGGTCTCCAG 

1001 ATGGTAAAAA AATAGCCTTC TGCTCTGTAA TTAAAGGGGT GCGACAAATT 

1051 TGTATTTACG ATCTCTCCTC TGGAGAGGAT TACCAACTCA CTACGTCTCC 

1101 CACAAATAAA GAGAGTCCTT CTTGGGCTAT AGACAGCCGT CATCTTGTCT 

1151 TTAGTGCGGG GAATGCTGAA GAATCAGAGT TATATTTAAT CAGTCTAGTC 

1201 ACCAAAAAAA CTAACAAAAT TGCTATAGGA GTAGGAGAAA AACGGTTCCC 

1251 CTCCTGGGGT GCTTTCCCTC AGCAACCGAT AAAGAGAACA CTATGA 

The PSORT algorithm predicts an inner membrane location (0.109). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 20A. 
A his-tagged protein was also expressed. The recombinant proteins were used to immunise mice, 
whose sera were used in a Western blot (Figure 20B) and for FACS analysis. 

These experiments show that cp7091 is a useful immunogen. These properties are not evident from 
the sequence alone. 

Example 21 

The following C.pneumoniae protein (pid 4376260) was expressed <SEQ ID 41; cp6260>: 



1 MRFSLCGFPL VFSFTLLSVF PTSLSATTIS LTPEDSFHGD SQNAERSYNV 
51 QAGDVYSLTG DVSISNVDNS ALNKACFNVT SGSVTFAGNH HGLYFNNISS 
101 GTTKEGAVLC CQDPQATARF SGFSTLSFIQ SPGDIKEQGC LYSKNALMLL 
151 NNYVVRFEQN QSKTKGGAIS GANVTIVGNY DSVSFYQNAA TFGGAIHSSG 
201 PLQIAVNQAE IRFAQNTAKN GSGGALYSDG DIDIDQNAYV LFRENEALTT 
251 AIGKGGAVCC LPTSGSSTPV PIVTFSDNKQ LVFERNHSIM GGGAIYARKL 
301 SISSGGPTLF INNISYANSQ NLGGAIAIDT GGEISLSAEK GTITFQGNRT 
351 SLPFLNGIHL LQNAKFLKLQ ARNGYSIEFY DPITSEADGS TQLNINGDPK 
401 NKEYTGTILF SGEKSLANDP KDFKSTIPQN VNLSAGYLVI KEGAEVTVSK 
451 FTQSPGSHLV LDLGTKLIAS KEDIAITGLA IDIDSLSSSS TAAVIKANTA 
501 NKQISVTDSI ELISPTGNAY EDLRMRHSQT FPLLSLEPGA GGSVTVTAGD 
551 FLPVSPHYGF QGNWKLAWTG TGNKVGEFFW DKINYKPRPE KEGNLVPNIL 
601 WGNAVDVRSL MQVQETHASS LQTDRGLWID GIGNFFHVSA SEDNIRYRHN 
651 SGGYVLSVNN EITPKHYTSM AFSQLFSRDK DYAVSNNEYR MYLGSYLYQY 
701 TTSLGNIFRY ASRHPNVNVG ILSRRFLQNP LMIFHFLCAY GHATNDMKTD 
751 YANFPMVKNS WRNNCWAIEC GGSMPLLVFE NGRLFQGAIP FMKLQLVYAY 
801 QGDFKETTAD GRRFSNGSLT SISVPLGIRF EKLALSQDVL YDFSFSYIPD 
851 IFKKDPSCEA ALVISGDSWL VPAAHVSRHA FVGSGTGRYH FNDYTELLCR 
901 GSIECRPHAR NYNINCGSKF RF* 



A predicted signal peptide is highlighted. 

The cp6260 nucleotide sequence <SEQ ID 42> is: 



1 ATGCGATTTT CGCTCTGCGG ATTTCCTCTA GTTTTTTCTT TTACATTGCT 
51 CTCAGTCTTC GACACTTCTT TGAGTGCTAC TACGATTTCT TTAACCCCAG 
101 AAGATAGTTT TCATGGAGAT AGTCAGAATG CAGAACGTTC TTATAATGTT 
151 CAAGCTGGGG ATGTCTATAG CCTTACTGGT GATGTCTCAA TATCTAACGT 
201 CGATAACTCT GCATTAAATA AAGCCTGCTT CAATGTGACC TCAGGAAGTG 
251 TGACGTTCGC AGGAAATCAT CATGGGTTAT ATTTTAATAA TATTTCCTCA 
301 GGAACTACAA AGGAAGGGGC TGTACTTTGT TGCCAAGATC CTCAAGCAAC 
351 GGCACGTTTT TCTGGGTTCT CCACGCTCTC TTTTATTCAG AGCCCCGGAG 
401 ATATTAAAGA ACAGGGATGT CTCTATTCAA AAAATGCACT TATGCTCTTA 
451 AACAATTATG TAGTGCGTTT TGAACAAAAC CAAAGTAAGA CTAAAGGCGG 
501 AGCTATTAGT GGGGCGAATG TTACTATAGT AGGCAACTAC GATTCCGTCT 
551 CTTTCTATCA GAATGCAGCC ACTTTTGGAG GTGCTATCCA TTCTTCAGGT 
601 CCCCTACAGA TTGCAGTAAA TCAGGCAGAG ATAAGATTTG CACAAAATAC 
651 TGCCAAGAAT GGTTCTGGAG GGGCTTTGTA CTCCGATGGT GATATTGATA 
701 TTGATCAGAA TGCTTATGTT CTATTTCGAG AAAATGAGGC ATTGACTACT 
751 GCTATAGGTA AGGGAGGGGC TGTCTGTTGT CTTCCCACTT CAGGAAGTAG 
801 TACTCCAGTT CCTATTGTGA CTTTCTCTGA CAATAAACAG TTAGTCTTTG 
851 AAAGAAACCA TTCCATAATG GGTGGCGGAG CCATTTATGC TAGGAAACTT 
901 AGCATCTCTT CAGGAGGTCC TACTCTATTT ATCAATAATA TATCATATGC 
951 AAATTCGCAA AATTTAGGTG GAGCTATTGC CATTGATACT GGAGGGGAGA 
1001 TCAGTTTATC AGCAGAGAAA GGAACAATTA CATTCCAAGG AAACCGGACG 
1051 AGCTTACCGT TTTTGAATGG CATCCATCTT TTACAAAATG CTAAATTCCT 
1101 GAAATTACAG GCGAGAAATG GATACTCTAT AGAATTTTAT GATCCTATTA 
1151 CTTCTGAAGC AGATGGGTCT ACCCAATTGA ATATCAACGG AGATCCTAAA 
1201 AATAAAGAGT ACACAGGGAC CATACTCTTT TCTGGAGAAA AGAGTCTAGC 
1251 AAACGATCCT AGGGATTTTA AATCTACAAT CCCTCAGAAC GTCAACCTGT 
1301 CTGCAGGATA CTTAGTTATT AAAGAGGGGG CCGAAGTCAC AGTTTCAAAA 
1351 TTCACGCAGT CTCCAGGATC GCATTTAGTT TTAGATTTAG GAACCAAACT 
1401 GATAGCCTCT AAGGAAGACA TTGCCATCAC AGGCCTCGCG ATAGATATAG 
1451 ATAGCTTAAG CTCATCCTCA ACAGCAGCTG TTATTAAAGC AAACACCGCA 
1501 AATAAACAGA TATCCGTGAC GGACTCTATA GAACTTATCT CGCCTACTGG 
1551 CAATGCCTAT GAAGATCTCA GAATGAGAAA TTCACAGACG TTCCCTCTGC 
1601 TCTCTTTAGA GCCTGGAGCC GGGGGTAGTG TGACTCTAAC TGCTGGAGAT 
1651 TTCCTACCGG TAAGTCCCCA TTATGGTTTT CAAGGCAATT GGAAATTAGC 
1701 TTGGACAGGA ACTGGAAACA AAGTTGGAGA ATTCTTCTGG GATAAAATAA 
1751 ATTATAAGCC TAGACCTGAA AAAGAAGGAA ATTTAGTTCC TAATATCTTG 
1801 TGGGGGAATG CTGTAGATGT CAGATCCTTA ATGCAGGTTC AAGAGACCCA 
1851 TGCATCGAGC TTACAGACAG ATCGAGGGCT GTGGATCGAT GGAATTGGGA 
1901 ATTTCTTCCA TGTATCTGCC TCCGAAGACA ATATAAGGTA CCGTCATAAC 
1951 AGCGGTGGAT ATGTTCTATC TGTAAATAAT GAGATCACAC CTAAGCACTA 
2001 TACTTCGATG GCATTTTCCC AACTCTTTAG TAGAGACAAG GACTATGCGG 
2051 TTTCCAACAA CGAATACAGA ATGTATTTAG GATCGTATCT CTATCAATAT 
2101 ACAACCTCCC TAGGGAATAT TTTCCGTTAT GCTTCGCGTA ACCCTAATGT 
2151 AAACGTCGGG ATTCTCTCAA GAAGGTTTCT TCAAAATCCT CTTATGATTT 
2201 TTCATTTTTT GTGTGCTTAT GGTCATGCCA CCAATGATAT GAAAACAGAC 
2251 TACGCAAATT TCCCTATGGT GAAAAACAGC TGGAGAAACA ATTGTTGGGC 
2301 TATAGAGTGC GGAGGGAGCA TGCCTCTATT GGTATTTGAG AACGGAAGAC 
2351 TTTTCCAAGG TGCCATCCCA TTTATGAAAC TACAATTAGT TTATGCTTAT 
2401 CAGGGAGATT TCAAAGAGAC GACTGCAGAT GGCCGTAGAT TTAGTAATGG 
2451 GAGTTTAACA TCGATTTCTG TACCTCTAGG CATACGCTTT GAGAAGCTGG 
2501 CACTTTCTCA GGATGTACTC TATGACTTTA GTTTCTCCTA TATTCCTGAT 
2551 ATTTTCCGTA AGGATCCCTC ATGTGAAGCT GCTCTGGTGA TTAGCGGAGA 
2601 CTCCTGGCTT GTTCCGGCAG CACACGTATC AAGACATGCT TTTGTAGGGA 
2651 GTGGAACGGG TCGGTATCAC TTTAACGACT ATACTGAGCT CTTATGTCGA 
2701 GGAAGTATAG AATGCCGCCC CCATGCTAGG AATTATAATA TAAACTGTGG 
2751 AAGCAAATTT CGTTTTTAG 


The PSORT algorithm predicts an outer membrane location (0.921). 

The protein was expressed in E.coli and purified both as a his-tag and as a GST-fusion product. The
GST-fusion is shown in Figure 21A. This recombinant protein was used to immunise mice, whose sera
were used in a Western blot (Figure 21B) and for FACS analysis (Figure 21C).

This protein also showed good cross-reactivity with human sera, including sera from patients with 
pneumonitis. 

These experiments show that cp6260 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 22 



The following C.pneumoniae protein (pid 4376456) was expressed <SEQ ID 43; cp6456>: 

1 MSSPVNNTPS APNIPIPAPT TPGIPTTKPR SSFIEKVIIV AKYILFAIAA

51 TSGALGTILG LSGALTPGIG IALLVIFFVS MVLLGLILKD SISGGEERRL

101 REEVSRFTSE NQRLTVTTTT LETEVKDLKA AKDQLTLEIE AFRNENGNLK

151 TTAEDLEEQV SKLSEQLEAL ERINQLIQAN AGDAQEISSE LKKLISGWDS

201 KVVEQINTSI QALKVLLGQE WVQEAQTHVK AMQEQIQALQ AEILGMHNQS

251 TALQKSVENL LVQDQALTRV VGELLESENK LSQACSALRQ EIEKLAQHET

301 SLQQRIDAML AQEQNLAEQV TALEKMKQEA QKAESEFIAC VRDRTFGRRE

351 TPPPTTPVVE GDESQEEDEG GTPPVSQPSS PVDRATGDGQ *

The cp6456 nucleotide sequence <SEQ ID 44> is: 

1 ATGTCATCTC CTGTAAATAA CACACCCTCA GCACCAAACA TTCCAATACC 

51 AGCGCCCACG ACTCCAGGTA TTCCTACAAC AAAACCTCGT TCTAGTTTCA

101 TTGAAAAGGT TATCATTGTA GCTAAGTACA TACTATTTGC AATTGCAGCC

151 ACATCAGGAG CACTCGGAAC AATTCTAGGT CTATCTGGAG CGCTAACCCC 

201 AGGAATAGGT ATTGCCCTTC TTGTTATCTT CTTTGTTTCT ATGGTGCTTT 

251 TAGGTTTAAT CCTTAAAGAT TCTATAAGTG GAGGAGAAGA ACGCAGGCTC 

301 AGAGAAGAGG TCTCTCGATT TACAAGTGAG AATCAACGGT TGACAGTCAT

351 AACCACAACA CTTGAGACTG AAGTAAAGGA TTTAAAAGCA GCTAAAGATC 

401 AACTTACACT TGAAATCGAA GCATTTAGAA ATGAAAACGG TAATTTAAAA 

451 ACAACTGCTG AGGACTTAGA AGAGCAGGTT TCTAAACTTA GCGAACAATT 

501 AGAAGCACTA GAGCGAATTA ATCAACTTAT CCAAGCAAAC GCTGGAGATG 

551 CTCAAGAAAT TTCGTCTGAA CTAAAGAAAT TAATAAGCGG TTGGGATTCC 

601 AAAGTTGTTG AACAGATAAA TACTTCTATT CAAGCATTGA AAGTGTTATT 

651 GGGTCAAGAG TGGGTGCAAG AGGCTCAAAC ACACGTTAAA GCAATGCAAG 

701 AGCAAATTCA AGCATTGCAA GCTGAAATTC TAGGAATGCA CAATCAATCT 

751 ACAGCATTGC AAAAGTCAGT TGAGAATCTA TTAGTACAAG ATCAAGCTCT

801 AACAAGAGTA GTAGGTGAGT TGTTAGAGTC TGAGAACAAG CTAAGCCAAG 

851 CTTGTTCTGC GCTACGTCAA GAAATAGAAA AGTTGGCCCA ACATGAAACA 

901 TCTTTGCAAC AACGTATTGA TCCGATGCTA GCCCAAGAGC AAAATTTGGC 

951 AGAGCAGGTC ACAGCCCTTG AAAAAATGAA ACAAGAAGCT CAGAAGGCTG 

1001 AGTCCGAGTT CATTGCTTGT GTACGTGATC GAACTTTCGG ACGTCGTGAA 

1051 ACACCTCCAC CAACAACACC TGTAGTTGAA GGTGATGAAA GTCAAGAAGA 

1101 AGACGAAGGA GGTACTCCCC CAGTATCACA ACCATCTTCA CCCGTAGATA 

1151 GAGCAACAGG AGATGGTCAG TAA 

The PSORT algorithm predicts an inner membrane location (0.127).

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 22A.
The recombinant protein was used to immunise mice, whose sera were used in a Western blot
(Figure 22B) and for FACS analysis (Figure 22C). A his-tag protein was also expressed.

These experiments show that cp6456 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.
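
The amino acid and nucleotide listings of each example are related by the genetic code, so the first row of one can be checked against the first row of the other. The following is a minimal sketch only, written in Python and not part of the original disclosure. The names CODON_TABLE and translate are illustrative, and the standard genetic code is assumed. ROW_1 is the first row of the cp6456 nucleotide sequence <SEQ ID 44>.

```python
# Assumption: standard genetic code, laid out in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nucleotides: str) -> str:
    """Translate a nucleotide string in frame 1, ignoring spaces between blocks.

    Trailing bases that do not make up a complete codon are dropped.
    """
    seq = "".join(nucleotides.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First row (nucleotides 1-50) of the cp6456 nucleotide sequence.
ROW_1 = "ATGTCATCTC CTGTAAATAA CACACCCTCA GCACCAAACA TTCCAATACC"

if __name__ == "__main__":
    print(translate(ROW_1))
```

Under these assumptions the 16 complete codons of the row translate to MSSPVNNTPSAPNIPI, which is the start of <SEQ ID 43>. The same check can be applied row by row, provided the reading frame is carried over from one row to the next.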

Example 23 

The following C.pneumoniae protein (pid 4376729) was expressed <SEQ ID 45; cp6729>:

1 MKIPLHKLLI SSTLVTPILL SXATYGADAS LSPTDSFDGA GGSTFTPKST

51 ADAWGTNYVL SGNVYXNDAG KGTALTGCCF TETTGDLTFT GKGYSFSFNT

101 VDAGSNAGAA ASTTADKALT FTGFSNLSFI AAPGTTVASG KSTLSSAGAL

151 NLTDNGTILF SQNVSNEANN NGGA2TTKTX* SISGNTSSIT FTSNSAKKLG

201 GAIYSSAAAS ISGNTGQLVF MNNKGETGGG ALtGFEASSSI TQNSSLFFSG

251 NTATDAAGKG GAIYCEKTGE TPTLTISGNK SLTFAENSSV TQGGAICAHG

301 LDLSAAGPTL FSNNRCGNTA AGKGGAIAIA DSGSLSLSAN QGDITFLGNT

351 LTSTSAPTST RNAIYLGSSA KITNLRAAQG QSIYFYDPIA SNTTGASDVL

401 TINQPDSNSP LDYSGTIVFS GEKLSADEAK AADNFTSILK QPLALASGTL

451 ALKGNVELDV NGFTQTEGST LLMQPGTKLK ADTEAISLTK LVVDLSALEG

501 NKSVSIETAG ANKTITLTSP LVFQDSSGNF YESHTINQAF TQPLVVFTAA

551 TAASDIYIDA LLTSPVQTPE PHYGYQGHWE ATWADTSTAK SGTMTWVTTG

601 YNPNPERRAS VVPDSLWASF TDIRTLQQIM TSQANSIYQQ RGLWASGTAN

651 FFHKDKSGTN QAFRHKSYGY IVGGSAEDFS ENIFSVAFCQ LFGKDKDLFI

701 VENTSHNYLA SLYLQHRAFL GGLPMPSFGS ITDMLKDIPL ILNAQLSYSY

751 TKNDMDTRYT SYPEAQGSWT NNSGALELGG SLALYLPREA PFFQGYFPFL

801 KFQAVYSRQQ NFKESGAEAR AFDDGDLVNC SIPVGIRLEK ISEDEKNNFE

851 ISLAYIGDVY RKNPRSRTSL MVSGASWTSL CKNLARQAFL ASAGSHLTLS

901 PHVELSGEAA YELRGSAHIY NVDCGLRYSF *

A predicted signal peptide is highlighted. 

The cp6729 nucleotide sequence <SEQ ID 46> is: 

1 ATGAAAATAC
51 CATTCTATTG
101 CAGATAGCTT
151 GCAGATGCCA
201 CGATGCTGGG
251 CGGGTGATCT
301 GTAGATGCGG
351 AGCCCTAACA
401 GAACTACAGT
451 AATCTTACCG
501 AGCTAATAAC
551 GGAATACCTC
601 GGAGCGATCT
651 GTTAGTCTTT
701 TTGAAGCCAG
751 AACACTGCAA
801 AACAGGAGAG
851 TCGCCGAGAA

TTCACTAGTA ATAGCGCAAA AAAATTAGGT






WO 02/02606 



PCT/IB01/01445
901 CTAGATCTTT CCGCTGCTGG CCCTACCCTA TTTTCAAATA ATAGATGCGG
951 GAACACAGCT GCAGGCAAGG GCGGCGCTAT TGCAATTGCC GACTCTGGAT
1001 CTTTAAGTCT CTCTGCAAAT CAAGGAGACA TCACGTTCCT TGGCAACACT
1051 CTAACCTCAA CCTCCGCGCC AACATCGACA CGGAATGCTA TCTACCTGGG
1101 ATCGTCAGCA AAAATTACGA ACTTAAGGGC AGCCCAAGGC CAATCTATCT
1151 ATTTCTATGA TCCGATTGCA TCTAACACCA CAGGAGCTTC AGACGTTCTG
1201 ACCATCAACC AACCGGATAG CAACTCGCCT TTAGATTATT CAGGAACGAT
1251 TGTATTTTCT GGGGAAAAGC TCTCTGCAGA TGAAGCGAAA GCTGCTGATA
1301 ACTTCACATC TATATTAAAG CAACCATTGG CTCTAGCCTC TGGAACCTTA
1351 GCACTCAAAG GAAATGTCGA GTTAGATGTC AATGGTTTCA CACAGACTGA
1401 AGGCTCTACA CTCCTCATGC AACCAGGAAC AAAGCTCAAA GCAGATACTG
1451 AAGCTATCAG TCTTACCAAA CTTGTCGTTG ATCTTTCTGC CTTAGAGGGA
1501 AATAAGAGTG TGTCCATTGA AACAGCAGGA GCCAACAAAA CTATAACTCT
1551 AACCTCTCCT CTTGTTTTCC AAGATAGTAG CGGCAATTTT TATGAAAGCC
1601 ATACGATAAA CCAAGCCTTC ACGCAGCCTT TGGTGGTATT CACTGCTGCT
1651 ACTGCTGCTA GCGATATTTA TATCGATGCG CTTCTCACTT CTCCAGTACA
1701 AACTCCAGAA CCTCATTACG GGTATCAGGG ACATTGGGAA GCCACTTGGG
1751 CAGACACATC AACTGCAAAA TCAGGAACTA TGACTTGGGT AACTACGGGC
1801 TACAACCCTA ATCCTGAGCG TAGAGCTTCC GTAGTTCCCG ATTCATTATG
1851 GGCATCCTTT ACTGACATTC GCACTCTACA GCAGATCATG ACATCTCAAG
1901 CGAATAGTAT CTATCAGCAA CGAGGACTCT GGGCATCAGG AACTGCGAAT
1951 TTCTTCCATA AGGATAAATC AGGAACTAAC CAAGCATTCC GACATAAAAG
2001 CTACGGCTAT ATTGTTGGAG GAAGTGCTGA AGATTTTTCT GAAAATATCT
2051 TCAGTGTAGC TTTCTGCCAG CTCTTCGGTA AAGATAAAGA CCTGTTTATA
2101 GTTGAAAATA CCTCTCATAA CTATTTAGCG TCGCTATACC TGCAACATCG
2151 AGCATTCCTA GGAGGACTTC CCATGCCCTC ATTTGGAAGT ATCACCGACA
2201 TGCTGAAAGA TATTCCTCTC ATTTTGAATG CCCAGCTAAG CTACAGCTAC
2251 ACTAAAAATG ATATGGATAC TCGCTATACT TCCTATCCTG AAGCTCAAGG
2301 CTCTTGGACC AATAACTCTG GGGCTCTAGA GCTCGGAGGA TCTCTGGCTC
2351 TATATCTCCC TAAAGAAGCA CCGTTCTTCC AGGGATATTT CCCCTTCTTA
2401 AAGTTCCAGG CAGTCTACAG CCGCCAACAA AACTTTAAAG AGAGTGGCGC
2451 TGAAGCCCGT GCTTTTGATG ATGGAGACCT AGTGAACTGC TCTATCCCTG
2501 TCGGCATTCG GTTAGAAAAA ATCTCCGAAG ATGAAAAAAA TAATTTCGAG
2551 ATTTCTCTAG CCTACATTGG TGATGTGTAT CGTAAAAATC CCCGTTCGCG
2601 TACTTCTCTA ATGGTCAGTG GAGCCTCTTG GACTTCGCTA TGTAAAAACC
2651 TCGCACGACA AGCCTTCTTA GCAAGTGCTG GAAGCCATCT GACTCTCTCC
2701 CCTCATGTAG AACTCTCTGG GGAAGCTGCT TATGAGCTTC GTGGCTCAGC
2751 ACACATCTAC AATGTAGATT GTGGGCTAAG ATACTCATTC TAG



The PSORT algorithm predicts outer membrane (0.927). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 23A.
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 23B) and for FACS analysis (Figure 23C). A his-tag protein was also expressed. 

The cp6729 protein was also identified in the 2D-PAGE experiment (Cpn0446) and showed good 
cross-reactivity with human sera, including sera from patients with pneumonitis. 

These experiments show that cp6729 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 24 

The following C.pneumoniae protein (pid 4376849) was expressed <SEQ ID 47; cp6849>: 

1 MSKLIRRVVT VLALTSMfi.SC FASGGIEAAV AESLITKIVA SAETKPAPVP
51 MTAKKVRLVR RNKQPVEQKS RGAFCDKEPY PCEEGRCQPV EAQQESCYGR
101 LYSVKVNDDC NVEICQSVPE YATVGSPYPI EILAIGKKDC VDVVITQQLP
151 CEAEFVSSDP ETTPTSDGKL VWKIDRLGAG DKCKITVWVK PLKEGCCFTA
201 ATVCACPELR SYTKCGQPAI CIKQEGPDCA CLRCPVCYKI EVVNTGSAIA
251 RNVTVDNPVP DGYSHASGQR VLSFNLGDMR PGDKKVFTVE FCPQRRGQIT
301 NVATVTYCGG HKCSAMVTTV VNEPCVQVNI SGADWSYVCK PVEYSISVSN
351 PGDLVLHDVV IQDTLPSGVT VLEAPGGEIC CNKVVWRIKE MCPGETLQFK
401 LVVKAQVPGR FTNQVAVTSE SNCGTCTSCA ETTTHWKGIA ATHMCVLDTN
451 DPICVGENTV YRICVTNRGS AEDTNVSLIL KFSKELQPIA SSGPTKGTIS
501 GNTVVFDALP KLGSKESVEF SVTLKGIAPG DARGEAILSS DTLTSPVSDT
551 ENTHVY*



A predicted signal peptide is highlighted. 

The cp6849 nucleotide sequence <SEQ ID 48> is: 

1 ATGTCCAAAC TCATCAGACG AGTAGTTACG GTCCTTGCGC TAACGAGTAT
51 [illegible] TTTGCCAGCG GGGGTATAGA GGCCGCTGTA GCAGAGTCTC
101 TGATTACTAA GATCGTCGCT AGTGCGGAAA CAAAGCCAGC ACCTGTTCCT
151 ATGACAGCGA AGAAGGTTAG ACTTGTCCGT AGAAATAAAC AACCAGTTGA
201 ACAAAAAAGC CGTGGTGCTT TTTGTGATAA AGAATTTTAT CCCTGTGAAG
251 AGGGACGATG TCAACCTGTA GAGGCTCAGC AAGAGTCTTG CTACGGAAGA
301 TTGTATTCTG TAAAAGTAAA CGATGATTGC AACGTAGAAA TTTGCCAGTC
351 CGTTCCAGAA TACGCTACTG TAGGATCTCC TTACCCTATT GAAATCCTTG
401 CTATAGGCAA AAAAGATTGT GTTGATGTTG TGATTACACA ACAGCTACCT
451 TGCGAAGCTG AATTCGTAAG CAGTGATCCA GAAACAACTC CTACAAGTGA
501 TGGGAAATTA GTCTGGAAAA TCGATCGCCT GGGTGCAGGA GATAAATGCA
551 AAATTACTGT ATGGGTAAAA CCTCTTAAAG AAGGTTGCTG CTTCACAGCT
601 GCTACTGTAT GTGCTTGCCC AGAGCTCCGT TCTTATACTA AATGCGGTCA
651 ACCAGCCATT TGTATTAAGC AAGAAGGACC TGACTGTGCT TGCCTAAGAT
701 GCCCTGTATG CTACAAAATC GAAGTAGTGA ACACAGGATC TGCTATTGCC
751 CGTAACGTAA CTGTAGATAA TCCTGTTCCC GATGGCTATT CTCATGCATC
801 TGGTCAAAGA GTTCTCTCTT TTAACTTAGG AGACATGAGA CCTGGCGATA
851 AAAAGGTATT TACAGTTGAG TTCTGCCCTC AAAGAAGAGG TCAAATCACT
901 AACGTTGCTA CTGTAACTTA CTGCGGTGGA CACAAATGTT CTGCAAATGT
951 AACTACAGTO GTTAATGAGC CTTGTGTACA AGTAAATATC TCTGGTGCTG
1001 ATTGGTCTTA CGTATGTAAA CCTGTGGAGT ACTCTATCTC AGTATCGAAT
1051 CCTGGAGACT TGGTTCTTCA TGATGTCGTG ATCCAAGATA CACTCCCTTC
1101 TGGTGTTACA GTACTCGAAG CTCCTGGTGG AGAGATCTGC TGTAATAAAG
1151 TTGTTTGGCG TATTAAAGAA ATGTGCCCAG GAGAAACCCT CCAGTTTAAA
1201 CTTGTAGTGA AAGCTCAAGT TCCTGGAAGA TTCACAAATC AAGTTGCAGT
1251 AACTAGTGAG TCTAACTGCG GAACATGTAC ATCTTGCGCA GAAACAACAA
1301 CACATTGGAA AGGTCTTGCA GCTACCCATA TGTGCGTATT AGACACAAAT
1351 GATCCTATCT GTGTAGGAGA AAATACTGTC TATCGTATCT GTGTAACTAA
1401 CCGTGGTTCT GCTGAAGATA CTAACGTATC TTTAATCO?TG AAGTTCTCAA
1451 AAGAACTTCA GCCAATAGCT TCTTCAGGTC CAACTAAAGG AACGATTTCA
1501 GGTAATACCG TTGTTTTCGA CGCTTTACCT AAACTCGGTT CTAAGGAATC
1551 TGTAGAGTTT TCTGTTACCT TGAAAGGTAT TGCTCCCGGA GATGCTCGCG
1601 GCGAAGCTAT TCTTTCTTCT GATACACTGA CTTCACCAGT ATCAGACACA
1651 GAAAATACCC ACGTGTATTA A



The PSORT algorithm predicts periplasmic space (0.93). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 24A,
and also as a his-tag protein. The recombinant proteins were used to immunise mice, whose sera 
were used in a Western blot (Figure 24B) and for FACS analysis (Figure 24C). 

The cp6849 protein was also identified in the 2D-PAGE experiment (Cpn0557). 

These experiments show that cp6849 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 25 

The following C.pneumoniae protein (pid 4376273) was expressed <SEQ ID 49; cp6273>: 



1 MGLFHLTLFG LLLCSLPISL VAKFPESVGH KILYISTQST QQALATYLEA
51 LDAYGDHDFF VLRKIGEDYL KQSIHSSDPQ TRKSTIIGAG LAGSSEALDV
101 LSQAMETADP LQQLLVLSAV SGHLGKTSDD LLFKALASPY PVIRLEAAYR
151 LANLKNTKVI DHLHSFIHKL PEEIQCLSAA IFLRLETEES DAYIRDLLAA
201 KKSAIRSATA LQIGEYQQKR FLPTLRNLLT SASPQDQEAI LYALGKDKDG
251 QSYYNIKKQL QKPDVDVTLA AAQALIALGK EEDALPVIKK QALEERPRAL
301 YALRHLPSEI GIPIALPIFL KTKNSEAKLN VALALLELGC DTPKLLEYIT
351 ERLVQPHYNE TLALSFSKGR TLQNWKRVNI IVPQDPQERE RLLSTTRGLE
401 EQILTFLFRL PKEAYLPCIY KLLASQKTQL ATTAISFLSH TSHQEALDLL
451 FQAAKLPGEP IIRAYADLAI YNLTKDPEKK RSLHDYAKKL IQETLLFVDT
501 ENQRPHPSMF YLRYQVTPES RTKLMLDILE TLATSKSSED IRLLIQLMTE
551 GDAKNFPVLA GLLIKIVE*

A predicted signal peptide is highlighted. 

The cp6273 nucleotide sequence <SEQ ID 50> is: 

1 ATGGGACTAT TCCATCTAAC TCTCTTTGGA CTTTTATTGT GTAGTCTTCC

51 CATTTCTCTT GTTGCTAAAT TCCCTGAGTC TGTAGGTCAT AAGATCCTTT 

101 ATATAAGTAC GCAATCTACA CAGCAGGCCT TAGCAACATA TCTGGAAGCT 

151 CTAGATGCCT ACGGTGATCA TGACTTCTTC GTTTTAAGAA AAATCGGAGA 

201 AGACTATCTC AAGCAAAGCA TCCACTCCTC AGATCCGCAA ACTAGAAAAA 

251 GCACCATCAT TGGAGCAGGC CTGGCGGGAT CTTCAGAAGC CTTGGACGTG

301 CTCTCCCAAG CTATGGAAAC TGCAGACCCC CTGCAGCAGC TACTGGTTTT 

351 ATCGGCAGTC TCAGGACATC TTGGGAAAAC TTCTGACGAC TTACTGTTTA

401 AAGCTTTAGC ATCTCCCTAT CCTGTCATCC GCTTAGAAGC CGCCTATAGA 

451 CTTGCTAATT TGAAGAACAC TAAAGTCATT GATCATCTAC ATTCTTTCAT 

501 TCATAAGCTT CCCGAAGAAA TCCAATGCCT ATCTGCGGCA ATATTCCTAC

551 GCTTGGAGAC TGAAGAATCT GATGCTTATA TTCGGGATCT CTTAGCTGCC 

601 AAGAAAAGCG CGATTCGGAG TGCCACAGCT TTGCAGATCG GAGAATACCA 

651 ACAAAAACGC TTTCTTCCGA CACTTAGGAA TTTGCTAACG AGTGCGTCTC 

701 CTCAAGATCA AGAAGCTATT CTTTATGCTT TAGGGAAGCT TAAGGATGGT

751 CAGAGCTACT ACAATATAAA AAAGCAATTG CAGAAGCCTG ATGTGGATGT

801 CACTTTAGCA GCAGCTCAAG CTTTAATTGC TTTGGGGAAA GAAGAGGACG 

851 CTCTTCCCGT GATAAAAAAG CAAGCACTTG AGGAGCGGCC TCGAGCCCTG 

901 TATGCCTTAC GGCATCTACC CTCTGAGATA GGGATTCCGA TTGCCCTGCC 

951 GATATTCCTA AAAACTAAGA ACAGCGAAGC CAAGTTGAAT GTAGCTTTAG 

1001 CTCTCTTAGA GTTAGGGTGT GACACCCCTA AACTACTGGA ATACATTACC

1051 GAAAGGCTTG TCCAACCACA TTATAATGAG ACTCTAGCCT TGAGTTTCTC 

1101 TAAGGGGCGT ACTTTACAAA ATTGGAAGCG GGTGAACATC ATAGTCCCTC 

1151 AAGATCCCCA GGAGAGGGAA AGGTTGCTCT CCACAACCCG AGGTCTTGAA 

1201 GAGCAGATCC TTACGTTTCT CTTCCGCCTA CCTAAAGAAG CTTACCTCCC 

1251 CTGTATTTAT AAGCTTTTGG CGAGTCAGAA AACTCAGCTT GCCACTACTG

1301 CGATTTCTTT TTTAAGTCAC ACCTCACATC AGGAAGCCTT AGATCTACTT 

1351 TTCCAAGCTG CGAAGCTTCC TGGAGAACCT ATCATCCGCG CCTATGCAGA 

1401 TCTTGCTATT TATAATCTCA CCAAAGATCC TGAAAAAAAA CGTTCTCTCC 

1451 ATGATTATGC AAAAAAGCTA ATTCAGGAAA CCTTGTTATT TGTGGACACG 

1501 GAAAACCAAA GACCCCATCC CAGCATGCCC TATCTACGTT ATCAGGTCAC

1551 CCCAGAAAGC CGTACGAAGC TCATGTTGGA TATTCTAGAG ACACTAGCCA 

1601 CCTCGAAGTC TTCCGAAGAT ATCCGTTTAT TGATACAACT GATGACGGAA 

1651 GGAGATGCAA AAAATTTCCC AGTCCTTGCA GGCTTACTCA TAAAAATTGT 

1701 GGAGTAA 

The PSORT algorithm predicts a periplasmic location (0.922).

The protein was expressed in E.coli and purified as a his-tag product and as a GST-fusion product, as
shown in Figure 25A. The recombinant GST-fusion was used to immunise mice, whose sera were 
used in a Western blot (Figure 25B) and for FACS analysis (Figure 25C). 

This protein also showed good cross-reactivity with human sera, including sera from patients with 
pneumonitis.

These experiments show that cp6273 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 26 

The following C.pneumoniae protein (pid 4376735) was expressed <SEQ ID 51; cp6735>:

1 MTILRNFLTC SALFLALPAA AQVVYLHESD GYNGAINNKS LEPKITCYPE

51 GTSYIFLDDV RISNVKHDQE DAGVFINRSG NLFFMGNRCN FTFHNLMTEG

101 FGAAISNRVG DTTLTLSNFS YLAFTSAPLL PQGQGAIYSL GSVMIENSEE

151 VTFCGNYSSW SGAAIYTPYL LGSKASRPSV NLSGNRYLVF RDNVSQGYGG

201 AISTHNLTLT TRGPSCFENN HAYHDVNSNG GAIAIAPGGS ISISVKSGDL

251 IFKGNTASQD GNTIHNSIHL QSGAQFKNLR AVSESGVYFY DPISHSESHK

301 ITDLVINAPE GKETYEGTIS FSGLCLDDHE VCAENLTSTI LQDVTLAGGT

351 LSLSDGVTLQ LHSFKQEASS TLTMSPGTTL LCSGDARVQN LHILIEDTDN

401 FVPVRIRAED KDALVSLEKL KVAFEAYWSV YDFPQFKEAF TIPLLELLGP

451 SFDSLLLGET TLERTQVTTE NDAVRGFWSL SWEEYPPSLD KDRRITPTKK

501 TVFLTWNPEI TSTP*



A predicted signal peptide is highlighted. 
The cp6735 nucleotide sequence <SEQ ID 52> is: 



1 ATGACCATAC TTCGAAATTT TCTTACCTGC TCGGCTTTAT TCCTCGCTCT 

51 CCCTGCAGCA GCACAAGTTG TATATCTTCA TGAAAGTGAT GGTTATAACG

101 GTGCTATCAA TAATAAAAGC TTAGAACCTA AAATTACCTG TTATCCAGAA 

151 GGAACTTCTT ACATCTTTCT AGATGACGTG AGGATTTCCA ACGTTAAGCA 

201 TGATCAAGAA GATGCTGGGG TTTTTATAAA TCGATCTGGG AATCTTTTTT 

251 TCATGGGCAA CCGTTGCAAC TTCACTTTTC ACAACCTTAT GACCGAGGGT 

301 TTTGGCGCTG CCATTTCGAA CCGCGTTGGA GACACCACTC TCACTCTCTC

351 TAATTTTTCT TACTTAGCGT TCACCTCAGC ACCTCTACTA CCTCAAGGAC 

401 AAGGAGCGAT TTATAGTCTT GGTTCCGTGA TGATCGAAAA TAGTGAGGAA 

451 GTGACTTTCT GTGGGAACTA CTCTTCGTGG AGTGGAGCTG CGATTTATAC 

501 TCCCTACCTT TTAGGTTCTA AGGCGAGTCG TCCTTCAGTA AATCTCAGCG 

551 GGAACCGCTA CCTGGTGTTT AGAGACAATG TGAGCCAAGG TTATGGCGGC

601 GCCATATCTA CCCACAATCT CACACTCACG ACTCGAGGAC CTTCGTGTTT 

651 TGAAAATAAT CATGCTTATC ATGACGTGAA TAGTAATGGA GGAGCCATTG 

701 CCATTGCTCC TGGAGGATCG ATCTCTATAT CCGTGAAAAG CGGAGATCTC 

751 ATCTTCAAAG GAAATACAGC ATCACAAGAC GGAAATACAA TACACAACTC 

801 CATCCATCTG CAATCTGGAG CACAGTTTAA GAACCTACGT GCTGTTTCAG

851 AATCCGGAGT TTATTTCTAT GATCCTATAA GCCATAGCGA GTCGCATAAA 

901 ATTACAGATC TTGTAATCAA TGCTCCTGAA GGAAAGGAAA CTTATGAAGG 

951 AACAATTAGC TTCTCAGGAC TATGCCTGGA TGATCATGAA GTTTGTGCGG 

1001 AAAATCTTAC TTCCACAATC CTACAAGATG TCACATTAGC AGGAGGAACT

1051 CTCTCTCTAT CGGATGGGGT TACCTTGCAA CTGCATTCTT TTAAGCAGGA

1101 AGCAAGCTCT ACGCTTACTA TGTCTCCAGG AACCACTCTG CTCTGCTCAG 

1151 GAGATGCTCG GGTTCAGAAT CTGCACATCC TGATTGAAGA TACCGACAAC 

1201 TTTGTTCCTG TAAGGATTCG CGCCGAGGAC AAGGATGCTC TTGTCTCATT 

1251 AGAAAAACTT AAAGTTGCCT TTGAGGCTTA TTGGTCCGTC TATGACTTTC 

1301 CTCAATTTAA GGAAGCCTTT ACGATTCCTC TTCTTGAACT TCTAGGGCCT

1351 TCTTTTGACA GTCTTCTCCT AGGGGAGACC ACTTTGGAGA GAACCCAAGT 

1401 CACAACAGAG AATGACGCCG TTCGAGGTTT CTGGTCCCTA AGCTGGGAAG 

1451 AGTACCCCCC TTCTCTGGAT AAAGACAGAA GGATCACACC AACTAAGAAA 

1501 ACTGTTTTCC TCACTTGGAA TCCTGAGATC ACTTCTACGC CATAA 

The PSORT algorithm predicts an outer membrane location (0.922).

The protein was expressed in E.coli and purified as a his-tag product and as a GST-fusion
product, as shown in Figure 26A. The recombinant GST-fusion protein was used to immunise mice, 
whose sera were used in a Western blot (Figure 26B). 

These experiments show that cp6735 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone.

Example 27 

The following C.pneumoniae protein (pid 4376784) was expressed <SEQ ID 53; cp6784>:

1 MNRRKARWVV ALFAMTALIS VGCCPWSQAK SRCSIDKYIP VVNRLLEVCG

51 LPEAENVEDL IESSSAWVLT PEERFSGELV SICQVKDEKA FYNDLSLLHM

101 TQAVPSYSAT YDCAVVFGGP LPALRQRLDF LVREWQRGVR PKKIVFLCGE

151 RGRYQSIEEQ EHFFDSRYNP FPTEENWESG NRVTPSSEEE IAKFVWMQML

201 LPRAWRDSTS GVRVTFLLAK PEENRVVANR KDTLLLFRSY QEAFPGRVLF

251 VSSQPFIGLD ACRVGQFFKG ESYDLAGPGF AQGVLKYHWA PRICLHTLAE

301 WLKETNGCLN ISEGCFG*



A predicted signal peptide is highlighted. 
The cp6784 nucleotide sequence <SEQ ID 54> is: 



1 ATGAATAGAA GAAAAGCAAG ATGGGTAGTG GCATTGTTCG CAATGACGGC 

51 GCTCATTTCT GTTGGGTGTT GTCCTTGGTC ACAAGCGAAA TCAAGATGTT 

101 CTATTGATAA GTATATTCCT GTAGTCAATC GTTTACTAGA AGTTTGTGGA 

151 CTTCCTGAAG CTGAGAATGT TGAGGATTTA ATCGAGTCCT CGTCTGCTTG 

201 GGTACTGACT CCTGAAGAAC GTTTTTCTGG AGAGTTAGTC TCTATCTGTC

251 AGGTTAAAGA TGAGCATGCT TTCTATAACG ATTTGTCTTT ATTACATATG 

301 ACTCAGGCTG TGCCTTCGTA TTCTGCAACG TATGATTGTG CTGTAGTTTT 

351 TGGCGGGCCT TTGCCAGCGC TACGTCAGCG CTTAGATTTT TTGGTGCGAG

401 AGTGGCAGCG TGGCGTGCGC TTTAAGAAAA TCGTTTTTCT ATGTGGAGAG

451 CGAGGGCGCT ATCAGTCTAT TGAAGAACAA GAGCATTTCT TTGATTCTCG

501 GTACAATCCT TTCCCTACTG AAGAGAACTG GGAATCTGGT AACCGAGTTA 

551 CTCCCTCTTC TGAAGAAGAG ATTGCCAAAT TTGTTTGGAT GCAAATGCTT 

601 TTACCTAGAG CATGGCGAGA TAGTACTTCA GGAGTCAGAG TGACATTTCT 

651 TCTAGCAAAG CCAGAGGAAA ATCGTGTGGT TGCGAATCGT AAGGACACCT 

701 TACTTTTATT CCGTTCTTAT CAAGAAGCGT TTCCGGGACG CGTGTTATTT

751 GTAAGTAGTC AACCCTTTAT CGGTTTAGAT GCTTGCAGGG TCGGGCAGTT 

801 TTTCAAAGGG GAAAGCTATG ATCTTGCTGG ACCTGGATTT GCTCAAGGAG 

851 TCTTGAAGTA TCATTGGGCT CCAAGGATTT GTCTACATAC TTTAGCGGAA 

901 TGGTTAAAGG AAACGAACGG CTGCTTAAAT ATTTCAGAGG GTTGTTTTGG 

951 ATGA

The PSORT algorithm predicts a periplasmic location (0.894). 

The protein was expressed in E.coli and purified as a his-tag product and as a GST-fusion product, as
shown in Figure 27A. The recombinant proteins were used to immunise mice, whose sera were used
in a Western blot (Figure 27B). The GST-fusion product was used for FACS analysis (Figure 27C). 

The cp6784 protein was also identified in the 2D-PAGE experiment (Cpn0498).

These experiments show that cp6784 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone.

Example 28 

The following C.pneumoniae protein (pid 4376960) was expressed <SEQ ID 55; cp6960>:

1 MNRRWNLVLA TVALALSVAS CDVRSKPKDK DQGSLVEYKD NKDTNDIELS

51 DNQKLSRTFG HLLARQLRKS EDMFFDIAEV AKGLQAELVC KSAPLTETEY

101 EEKMAEVQKL VFEKKSKENL SLAEKFLKEN SKNAGVVEVQ PSKLQYKIIK

151 EGAGKAISGK PSALLHYKGS FINGQVFSSS EGNNEPILLP LGQTIPGFAL

201 GMQGMKEGET RVLYIHPDLA YGTAGQLPPN SLLIFEINLI QASADEVAAV

251 PQEGNQGE*

A predicted signal peptide is highlighted. 

The cp6960 nucleotide sequence <SEQ ID 56> is: 

1 ATGAACAGAC GGTGGAATTT AGTTTTAGCA ACAGTAGCTC TGGCACTCTC 

51 CGTCGCTTCT TGTGACGTAC GGTCTAAGGA TAAAGACAAG GATCAGGGGT 

101 CGTTAGTGGA ATATAAAGAT AACAAAGATA CCAATGACAT AGAATTATCC

151 GATAATCAAA AGTTATCCAG AACATTTGGT CATTTATTAG CACGCCAATT 

201 ACGCAAGTCA GAAGATATGT TTTTTGATAT TGCAGAAGTG GCTAAGGGGT 

251 TGCAGGCGGA ATTGGTTTGT AAAAGTGCTC CTTTAACAGA AACAGAGTAT 

301 GAAGAAAAAA TGGCTGAAGT ACAGAAGTTG GTTTTTGAAA AAAAATCAAA

351 AGAAAATCTT TCATTGGCAG AAAAATTCTT AAAAGAAAAT AGCAAGAACG

401 CTGGTGTTGT TGAAGTGCAA CCAAGTAAAT TGCAATACAA AATTATTAAA

451 GAAGGTGCAG GGAAAGCAAT TTCAGGTAAA CCTTCAGCTC TATTGCACTA

501 CAAGGGTTCC TTCATCAATG GCCAAGTATT TAGCAGTTCA GAAGGCAACA

551 ATGAGCCTAT CTTGCTTCCT CTAGGCCAAA CAATTCCTGG TTTTGCTTTA 

601 GGTATGCAGG GCATGAAAGA AGGAGAAACT CGAGTTCTCT ACATCCATCC 

651 TGATCTTGCT TACGGAACCG CAGGACAACT TCCTCCAAAC TCTTTATTAA

701 TTTTTGAAAT TAACTTGATT CAGGCTTCAG CAGATGAAGT TGCTGCTGTA 

751 CCCCAAGAAG GAAATCAAGG TGAATGA 

The PSORT algorithm predicts periplasmic space location (0.930). 

The protein was expressed in E.coli and purified as a his-tag product and as a GST-fusion product, as
shown in Figure 28A. The recombinant proteins were used to immunise mice, whose sera were used
in a Western blot (Figure 28B) and for FACS analysis (Figure 28C). 

The cp6960 protein was also identified in the 2D-PAGE experiment. 

These experiments show that cp6960 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
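
The nucleotide listings are set out as numbered rows of 50 nucleotides in blocks of ten, the number giving the position of the first nucleotide in the row. The following is a minimal sketch, written in Python and not part of the original disclosure, of how such rows might be joined into a single sequence while checking the row numbers. The function name parse_listing is hypothetical, and ROWS holds the first two rows of <SEQ ID 56>.

```python
def parse_listing(rows):
    """Join numbered listing rows into one sequence string.

    Each row is expected to look like '51 CGTCGCTTCT TGTGACGTAC ...'.
    A ValueError is raised when a row number does not match the position
    reached so far, which flags a dropped block or a mis-numbered row.
    """
    sequence = ""
    for row in rows:
        fields = row.split()
        if not fields:
            continue  # skip blank lines between rows
        start, blocks = int(fields[0]), fields[1:]
        expected = len(sequence) + 1
        if start != expected:
            raise ValueError(f"row numbered {start} begins at position {expected}")
        sequence += "".join(blocks)
    return sequence


# First two rows of the cp6960 nucleotide sequence.
ROWS = [
    "1 ATGAACAGAC GGTGGAATTT AGTTTTAGCA ACAGTAGCTC TGGCACTCTC",
    "",
    "51 CGTCGCTTCT TGTGACGTAC GGTCTAAGGA TAAAGACAAG GATCAGGGGT",
]

if __name__ == "__main__":
    print(len(parse_listing(ROWS)))
```

The check is deliberately strict: a row whose number disagrees with the running length is reported rather than silently accepted, since either the number or the row content may be at fault.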

Example 29

The following C.pneumoniae protein (pid 4376968) was expressed <SEQ ID 57; cp6968>: 

1 MKFLLYVPLL LVLVSTGCDA KPVSFEPFSG KLSTQRFEPQ HSAEEYFSQG

51 QEFLKKGNFR KALLCFGIIT HHFPRDILRN QAQYLIGVCY FTQDHPDLAD

101 KAFASYLQLP DAEYSEELPQ MKYAIAQRFA QGKRKRICRL EGFPKLMNAD

151 EDALRIYDEI LTAFFSKDLG AQALYSKAAL LIVKNDLTEA TKTLKKLTLQ

201 FPLHILSSEA FVRLSEIYLQ QAKKEPHNLQ YLHFAKLNEE AMKKQHPNHP

251 LNEVVSANVG AMREHYARGL YATGRFYEKK KKAEAANIYY RTAITMYPDT

301 LLVAKCQKRL DRISKHTS*

A predicted signal peptide is highlighted. 
The cp6968 nucleotide sequence <SEQ ID 58> is:

1 ATGAAATTTC TATTATACGT TCCACTTCTT CTTGTTCTCG TATCTACGGG 

51 GTGCGATGCA AAACCTGTTT CTTTTGAGCC CTTTTCAGGA AAGCTTTCCA 

101 CCCAGCGTTT TGAGCCTCAG CACTCTGCTG AAGAATATTT TTCTCAGGGA 

151 CAGGAATTCT TAAAAAAAGG AAATTTCAGA AAAGCTTTAC TATGCTTTGG 

201 AATCATTACG CATCACTTCC CTAGGGACAT CTTGCGTAAT CAAGCACAGT

251 ATCTTATAGG AGTCTGTTAC TTCACGCAGG ATCACCCAGA TTTAGCAGAC 

301 AAGGCATTTC CATCTTACTT ACAACTTCCT GATGCGGAGT ACTCTGAAGA 

351 GTTGTTCCAG ATGAAATATG CGATTGCTCA AAGATTTGCT CAAGGGAAGC 

401 GTAAACGGAT TTGTCGATTA GAGGGCTTCC CAAAACTAAT GAATGCTGAT 

451 GAAGATGCGC TACGCATTTA TGACGAGATT CTAACAGCGT TTCCTAGTAA

501 AGACTTAGGA GCTCAGGCCC TCTATAGTAA AGCTGCGTTA CTTATTGTAA 

551 AAAACGATCT TACAGAAGCC ACCAAAACCT TAAAAAAACT CACGTTACAA 

601 TTTCCTCTAC ATATTTTATC TTCAGAGGCC TTTGTACGTT TATCGGAAAT 

651 CTATTTACAG CAAGCTAAGA AAGAGCCTCA CAATCTTCAA TATCTTCATT 

701 TTGCAAAGCT TAATGAAGAG GCAATGAAAA AGCAGCATCC TAACCATCCT

751 GTGAATGAGG TTGTTTCTGC TAATGTTGGA GCTATGCGGG AACATTATGC

801 TCGAGGTTTG TATGCCACAG GTCGTTTCTA TGAGAAGAAG AAAAAAGCCG 

851 AGGCTGCGAA TATCTATTAC CGCACTGCGA TTACAAACTA CCCAGACACT 

901 TTATTAGTGG CTAAATGTCA AAAGCGTCTA GATAGAATAT CTAAGCATAC 

951 TTCCTAA

The PSORT algorithm predicts an inner membrane location (0.790). 

The protein was expressed in E.coli and purified as a his-tag product and as a GST-fusion product, as
shown in Figure 29A. The recombinant GST-fusion was used to immunise mice, whose sera were 
used in a Western blot (Figure 29B) and for FACS analysis (Figure 29C). 

This protein also showed good cross-reactivity with human sera, including sera from patients with
pneumonitis. 

These experiments show that cp6968 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 30

The following C.pneumoniae protein (pid 4376998) was expressed <SEQ ID 59; cp6998>: 

1 MKKLLKSALL SAAFAGSVGS LQALPVGNPS DPSLLIDGTI WEGAAGDPCD

51 PCATWCDAIS LRAGFYGDYV FDRILKVDAP KTFSMGAKPT GSAAANYTTA

101 VDRPNPAYNK HLHDAEWFTN AGFIALNIWD RFDVFCTLGA SNGYIRGNST

151 AFNLVGLFGV KGTTVNANEL PNVSLSNGVV ELYTDTSFSW SVGARGALWE

201 CGCATLGAEF QYAQSKPKVE ELNVICNVSQ FSVNKPKGYK GVAFPLPTDA

251 GVATATGTKS ATINYHEWQV GASLSYRLNS LVPYIGVQWS RATFDADNIR

301 IAQPKLPTAV LNLTAWNPSL LGNATALSTT DSFSDFMQIV SCQINKFKSR

351 KACGVTVGAT LVDADKWSLT AEARLINERA AHVSGQFRF*

A predicted signal peptide is highlighted. 



The cp6998 nucleotide sequence <SEQ ID 60> is:

1 ATGAAAAAAC TCTTAAAGTC GGCGTTATTA TCCGCCGCAT TTGCTGGTTC
51 TGTTGGCTCC TTACAAGCCT TGCCTGTAGG GAACCCTTCT GATCCAAGCT
101 TATTAATTGA TGGTACAATA TGGGAAGGTG CTGCAGGAGA TCCTTGCGAT
151 CCTTGCGCTA CTTGGTGCGA CGCTATTAGC TTACGTGCTG GATTTTACGG
201 AGACTATGTT TTCGACCGTA TCTTAAAAGT AGATGCACCT AAAACATTTT
251 CTATGGGAGC CAAGCCTACT GGATCCGCTG CTGCAAACTA TACTACTGCC
301 GTAGATAGAC CTAACCCGGC CTACAATAAG CATTTACACG ATGCAGAGTG
351 GTTCACTAAT GCAGGCTTCA TTGCCTTAAA CATTTGGGAT CGCTTTGATG
401 TTTTCTGTAC TTTAGGAGCT TCTAATGGTT ACATTAGAGG AAACTCTACA
451 GCGTTCAATC TCGTTGGTTT ATTCGGAGTT AAAGGTACTA CTGTAAATGC
501 AAATGAACTA CCAAACGTTT CTTTAAGTAA CGGAGTTGTT GAACTTTACA
551 CAGACACCTC TTTCTCTTGG AGCGTAGGCG CTCGTGGAGC CTTATGGGAA
601 TGCGGTTGTG CAACTTTGGG AGCTGAATTC CAATATGCAC AGTCCAAACC
651 TAAAGTTGAA GAACTTAATG TGATCTGTAA CGTATCGCAA TTCTCTGTAA
701 ACAAACCCAA GGGCTATAAA GGCGTTGCTT TCCCCTTGCC AACAGACGCT
751 GGCGTAGCAA CAGCTACTGG AACAAAGTCT GCGACCATCA ATTATCATGA
801 ATGGCAAGTA GGAGCCTCTC TATCTTACAG ACTAAACTCT TTAGTGCCAT
851 ACATTGGAGT ACAATGGTCT CGAGCAACTT TOGATGCTGA TAACATCCGC
901 ATTGCTCAGC CAAAACTACC TACAGCTGTT TTAAACTTAA CTGCATGGAA
951 CCCTTCTTTA CTAGGAAATG CCACAGCATT GTCTACTACT GATTCGTTCT
1001 CAGACTTCAT GCAAATTGTT TCCTGTCAGA TCAACAAGTT TAAATCTAGA
1051 AAAGCTTGTG GAGTTACTGT AGGAGCTACT TTAGTTGATG CTGATAAATG
1101 GTCACTTACT GCAGAAGCTC GTTTAATTAA CGAGAGAGCT GCTCACGTAT
1151 CTGGTCAGTT CAGATTCTAA



The PSORT algorithm predicts an outer membrane location (0.707). 

The protein was expressed in E.coli and purified as a GST-fusion (Figure 30A) and as a his-tag
product. The recombinant GST-fusion protein was used to immunise mice, whose sera were used in a 
Western blot (Figure 30B) and for FACS analysis (Figure 30C). 

The cp6998 protein was also identified in the 2D-PAGE experiment (Cpn0695) and showed good 
cross-reactivity with human sera, including sera from patients with pneumonitis. 

These experiments show that cp6998 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 31

The following C.pneumoniae protein (pid 4377102) was expressed <SEQ ID 61; cp7102>: 

1 MKHTFTKRVL FFPFLVIPIP LLLNIWVGF FSFSAAKANL VQVLHTRATN

51 LSIEFEKKLT IHKLFLDRLA NTLALKSYAS PSAEPYAQAY NEMMALSNTD

101 FSLCLIDPFD GSVRTKNPGD PFIRYLKQHP EMKKKLSAAV GKAFLLTIPG

151 KPLLHYLILV EDVASWDSTT TSGLLVSFYP MSFLQKDLFQ SLHITKGNIC

201 LVNKYGEVLF CAQDSESSFV FSLDLPNLPQ FQARSFSAIE IEKASGILGG

251 ENLITVSINK KRYLGLVLNK IPIQGTYTLS LVPVSDLIQS ALKVPLNICF

301 FVVLAFLLMW WIFSKINTKL NKPLQELTFC MEAAWRGNHN VRFEPQPYGY

351 EFNELGNIFN CTLLLLLNSI EKADIDYHSG EKLQKELGIL SSLQSKLLSP

401 DFPTFPKVTF SSQHLRRRQL SGHFNGWTVQ DGGDTLLGII GLAGDIGLPS

451 YLYALSARSL FIAYASSDVS LQKISKDTAD SFSKTTEGNE AVVAMTFIKY

501 VEKDRSLELL SLSEGAPTMF LQRGESFVKL PLETHQALQP GDRLICLTGG

551 EDILKYFSQL PIEELLKDPL NPLNTENLID SLTMMLNNET EHSABGTLTI

601 LSFS*

A predicted signal peptide is highlighted. 

The cp7102 nucleotide sequence <SEQ ID 62> is:

1 ATGAAACATA CCTTTACCAA GCGTGTTCTA TTTTTTTTCT TTTTAGTGAT

51 TCCCATTCCC CTACTCCTCA ATCTTATGGT CGTAGGTTTT TTCTCATTTT

101 CTGCCGCTAA AGCAAATTTA GTACAGGTCC TCCATACCCG TGCTACGAAC

151 TTAAGTATAG AATTCGAAAA AAAACTGACG ATACACAAGC TTTTCCTCGA 

201 TAGACTTGCC AACACATTAG CCTTAAAATC CTATGCATCT CCTTCTGCAG 

251 AGCCCTATGC ACAGGCATAC AATGAGATGA TCGCACTCTC CAATACAGAC 

301 TTTTCCTTAT GCCTTATAGA TCCCTTTGAT GGATCTGTAA GGACGAAAAA 

351 TCCTGGAGAC CCTTTCATTC GCTATCTAAA ACAGCATCCT GAAATGAAGA

401 AAAAGCTATC CGCAGCTGTA GGGAAAGCCT TTTTATTGAC CATTCCAGGT 

451 AAACCACTTT TACATTATCT TATTCTAGTT GAAGATGTCG CATCTTGGGA 

501 TTCTACAACG ACTTCAGGAC TGCTTGTAAG TTTCTATCCC ATGTCTTTTT 

551 TACAGAAAGA TTTATTCCAA TCCTTACACA TCACCAAAGG AAATATCTGC

601 CTTGTAAATA AGTATGGCGA GGTCCTCTTC TGTGCTCAGG ACAGTGAATC

651 TTCTTTTGTA TTTTCTCTAG ATCTCCCTAA TTTACCGCAA TTCCAAGCAA 

701 GAAGCCCCTC TGCCATAGAA ATTGAGAAAG CTTCTGGAAT TCTTGGTGGG 

751 GAGAACCTAA TCACAGTGAG TATCAACAAG AAACGCTACC TAGGATTGGT 

801 ACTGAATAAA ATTCCTATCC AAGGGACCTA CACTCTATCT TTAGTTCCAG 

851 TTTCTGATCT CATCCAATCC GCCTTGAAAG TTCCTCTCAA TATTTGOTTT

901 TTCTATGTAC TTGCTTTCCT CCTCATGTGG TGGATTTTCT CTAAGATCAA 

951 CACCAAACTT AACAAGCCTC TTCAAGAACT GACCTTCTGT ATGGAAGCTG 

1001 CCTGGCGAGG AAACCATAAC GTGAGGTTTG AACCCCAGCC TTACGGTTAT 

1051 GAATTCAATG AACTAGGAAA TATTTTCAAT TGCACTCTCC TACTCTTATT

1101 GAATTCCATT GAGAAAGCAG ATATCGATTA CCATTCAGGC GAAAAATTAC

1151 AAAAAGAATT AGGGATTTTA TCTTCACTAC AAAGTGCGTT ACTAAGTCCG 

1201 GATTTCCCTA CGTTCCCTAA AGTTACCTTT AGTTCCCAAC ATCTCCGGAG 

1251 AAGGCAACTT TCCGGTCATT TTAATGGTTG GACAGTTCAA GATGGTGGCG 

1301 ATACCCTTTT AGGGATCATA GGGCTCGCTG GCGATATTGG TCTTCCTTCC

1351 TATCTCTATG CTTTATCCGC ACGGAGTCTT TTTCTTGCCT ATGCTTCCTC 

1401 GGACGTTTCG TTACAAAAAA TCAGCAAGGA TACTGCCGAC AGCTTCTCAA 

1451 AAACAAGAGA AGGCAATGAG GCTGTAGTTG CTATGACTTT CATTAAATAT 

1501 GTAGAAAAAG ATCGATCTCT AGAGCTCCTC TCGTTAAGCG AGGGAGCTCC 

1551 TACCATGTTT CTACAACGAG GAGAATCTTT CGTACGTCTC CCCTTAGAGA 

1601 CTCACCAAGC TCTACAGCCT GGAGATCGGT TGATCTGCCT CACTGGAGGA

1651 GAAGACATCC TCAAGTACTT TTCTCAGCTT CCTATTGAAG AGCTCTTAAA 

1701 AGATCCTTTA AACCCTCTAA ATACAGAGAA TCTTATTGAT TCTCTAACCA

1751 TGATGTTAAA CAACGAAACC GAACATTCTG CAGATGGAAC TCTGACCATC 

1801 CTTTCATTTT CATAA 

The PSORT algorithm predicts an inner membrane location (0.338).

The protein was expressed in E.coli and purified as a his-tag product and as a GST-fusion product.
The purified GST-fusion product is shown in Figure 31A. The recombinant GST-fusion protein was
used to immunise mice, whose sera were used in a Western blot and for FACS analysis (Figure 31B).

These experiments show that cp7102 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.

Example 32 

The following C.pneumoniae protein (pid 4377106) was expressed <SEQ ID 63; cp7106>:

1 MKDLGTLGGT SSTAKTVSPD GKVTMGRSQI ADGSWHAFMC HTDFSSNNVL

51 FDLDNTYKTL RENGRQLNSI FNLQNMMLQR ASDHEFTEFG RSNIALGAGL

101 YVNALQNLPS NLAAQYFGIA YKIRPKYRLG VFLDHNFSSH VPNNFNVSHN

151 RLWMGAFIGW QDSDALGSSV KVSFGYGKQK ATITREQLEN TEAGSGESHF

201 EGVAAQIEGR YGKSLGGHVR VQPFLGLQFV HITRKEYTEN AVQFPVHYDP

251 IDYSTGVVYL GIGSHIALVD SLHVGTRMGM EQNFAAHTDR FSGSIASIGN

301 FVFEKLDVTH TRAFAEMRVN YELPYLQSLN LILRVNQQPL QGVMGFSSDL

351 RYALGF*

The cp7106 nucleotide sequence <SEQ ID 64> is: 

1 ATGAAAGATT TGGGGACTCT TGGGGGTACC TCTTCTACAG CAAAAACAGT

51 GTCCCCAGAT GGTAAAGTGA TCATGGGTAG ATCACAAATT GCTGATGGCA

101 GTTGGCACGC ATTTATGTCT CATACGGATT TCTCCTCTAA TAATGTACTC

151 TTTGATCTCG ATAATACGTA TAAAACTCTA AGAGAAAATG GCCGTCAGCT

201 AAATTCCATA TTCAACCTAC AAAATATGAT GTTACAGAGA GCCTCAGATC

251 ATGAGTTCAC AGAGTTTGGA AGGAGTAACA TCGCTCTTGG TGCCGGGCTT

301 TATGTGAATG CCTTGCAGAA TCTCCCTAGC AATTTAGCAG CACAATATTT

351 TGGAATCGCA TACAAAATAC GTCCTAAATA TCGTTTGGGG GTGTTTTTGG

401 ACCATAATTT CAGCTCCCAC GTTCCTAATA ATTTTAACGT AAGCCACAAT

451 AGACTCTGGA TGGGAGCCTT TATTGGATGG CAGGATTCTG ATGCTCTAGG

501 ATCTAGTGTC AAGGTGTCTT TCGGATATGG AAAACAAAAA GCCACGATTA

551 CAAGAGAGCA ATTAGAGAAT ACAGAAGCCG GGAGTGGGGA GAGCCATTTT

601 GAAGGGGTCG CTGCTCAGAT AGAAGGGCGG TATGGTAAGA GCCTCGGAGG

651 ACATGTCAGG GTCCAGCCTT TCCTAGGACT GCAGTTTGTC CACATTACAA

701 GGAAAGAATA TACCGAAAAT GCAGTGCAAT TTCCTGTACA CTATGATCCT

751 ATAGACTATT CTACAGGTGT AGTGTATTTA GGAATTGGAT CTCATATTGC

801 ACTTGTAGAT TCTTTACATG TAGGCACACG CATGGGAATG GAGCAAAACT

851 TTGCAGCCCA TACGGACAGG TTCTCAGGAT CTATAGCGTC TATTGGAAAC

901 TTTGTGTTTG AAAAGCTTGA TGTGACTCAC ACAAGGGCAT TTGCGGAAAT

951 GCGTGTCAAC TATGAGCTTC CCTATCTACA GTCTCTGAAT CTTATTCTAC

1001 GAGTTAATCA ACAGCCTCTA CAAGGGGTTA TGGGATTTTC CAGTGATCTT

1051 AGGTATGCCT TAGGATTCTA A

The PSORT algorithm predicts a cytoplasmic location (0.224).

The protein was expressed in E.coli and purified as a his-tag product and as a GST-fusion product.
The purified GST-fusion product is shown in Figure 32A. The recombinant GST-fusion protein was
used to immunise mice, whose sera were used in a Western blot (Figure 32B) and for FACS analysis
(Figure 32C).

This protein also showed very good cross-reactivity with human sera, including sera from patients
with pneumonitis.

These experiments show that cp7106 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.

Example 33

The following C.pneumoniae protein (pid 4377228) was expressed <SEQ ID 65; cp7228>:

1 MTAVLILTSF PSEESARSLA RHLITERLAS CVHVFPKGTS TYLWEGKLCE

51 SEEHHIQIKS IDIRFSEICL AIQEFSGYEV PEVLLFPIEN GDPKYLNWLT

101 ILSYPEKPPL SD*

The cp7228 nucleotide sequence <SEQ ID 66> is:

1 ATGACTGCTG TTCTTATTCT TACATCTTTC CCTTCGGAGG AAAGTGCTCG 

51 CTCCTTAGCT AGACATCTGA TTACAGAGCG TCTTGCTTCC TGTGTGCATG 

101 TATTCCCTAA AGGCACATCG ACATATCTAT GGGAAGGCAA GCTATGTGAG 

151 TCTGAAGAAC ATCATATACA AATCAAATCG ATAGACATAC GCTTCTCGGA 

201 AATTTGTCTT GCTATTCAGG AGTTCTCTGG CTATGAGGTT CCTGAAGTCT 

251 TACTATTTCC TATTGAAAAT GGGGATCCGA GGTACTTGAA TTGGTTAACG 

301 ATTCTCAGCT ATCCAGAGAA GCCTCCGCTT TCAGATTAG

The PSORT algorithm predicts an inner membrane location (0.040). 
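
The protein and nucleotide listings in each Example can be cross-checked against one another by translating the nucleotide sequence. The following is a minimal sketch of such a check, assuming the standard genetic code and reading frame 1; the function name `translate` and the variable names are illustrative only and do not come from the specification. It translates the first 60 and the last 39 nucleotides of the cp7228 listing above.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; spaces and digits are ignored."""
    dna = "".join(ch for ch in dna.upper() if ch in "ACGT")
    return "".join(CODON_TABLE[dna[i:i + 3]]
                   for i in range(0, len(dna) - 2, 3))


# Excerpts copied from the cp7228 listing (positions 1-60 and 301-339).
cp7228_start = ("ATGACTGCTG TTCTTATTCT TACATCTTTC "
                "CCTTCGGAGG AAAGTGCTCG CTCCTTAGCT")
cp7228_end = "ATTCTCAGCT ATCCAGAGAA GCCTCCGCTT TCAGATTAG"

print(translate(cp7228_start))
print(translate(cp7228_end))
```

The second excerpt ends in the stop codon TAG, which the sketch renders as `*`, matching the asterisk that closes each protein listing.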

The protein was expressed in E.coli and purified as a his-tag product and as a GST-fusion product, as
shown in Figure 33A (his-tag = left-hand arrow, GST = right-hand arrow). The proteins were used to
immunise mice, whose sera were used in a Western blot (Figure 33B) and FACS analysis.

These experiments show that cp7228 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 34

The following C.pneumoniae protein (pid 4377170) was expressed <SEQ ID 67; cp7170>:

1 MNSKMLKHLR LATLSFSMFF GIVSSEAVYA LGAGNPAAPV LPGVNPEQTG

51 WCAFQLCNSY DLFAALAGSL KFGFYGDYVF SESAHITNVP VITSVTTSGT

101 GTTPTITSTT KNVPFDLNNS SISSSCVFAT IALQETSPAA IPLLDIAFTA

151 RVGGLKQYYR LPLNAYRDFT SNPLNAESEV TDGLIEVQSD YGIVWGLSLQ

201 KVLWKDGVSP VGVSADYRHG SSPINYIIVY NKANFEIYFD ATDGNLSYKE

251 WSASIGISTY LNDYVLPYAS VSIGNTSRKA PSDSFTELEK QFTNFKFKIR

301 KITNFDRVHF CFGTTCCISN NFYYSVEGRW GYQRAINITS GLQF*

A predicted signal peptide is highlighted.

The cp7170 nucleotide sequence <SEQ ID 68> is:

1 ATGAATAGCA AGATGCTAAA ACATTTACGT TTAGCAACCC TTTCCTTCTC 

51 TATGTTCTTC GGGATTGTAT CTTCTCCCGC AGTATATGCC CTAGGGGCTG 

101 GAAACCCTGC AGCTCCAGTA CTCCCAGGTG TGAATCCTGA GCAAACGGGA 

151 TGGTGTGCCT TCCAACTTTG TAATAGTTAC GATCTTTTTG CTGCTCTTGC 

201 AGGAAGCCTC AAATTTGGGT TCTATGGAGA TTATGTCTTC TCAGAAAGTG

251 CCCATATTAC CAATGTCCCT GTCATTACCT CCGTTACGAC TTCAGGCACA 

301 GGAACAACGC CAACCATTAC CTCTACAACT AAAAACGTAG ACTTTGATCT 

351 TAACAACAGC TCCATCAGCT CGAGCTGTGT TTTTGCAACC ATAGCTCTAC 

401 AGGAAACATC CCCAGCTGCC ATTCCCCTTT TAGATATAGC CTTCACTGCA 

451 CGTGTCGGAG GACTTAAGCA GTACTACCGC CTCCCTCTCA ATGCTTACAG

501 AGACTTCACT TCAAATCCTT TAAATGCAGA ATCTGAAGTT ACAGATGGTC

551 TCATTGAAGT CCAGTCAGAC TATGGAATTG TCTGGGGTCT GAGTTTACAA 

601 AAAGTATTGT GGAAAGATGG AGTGTCTTTT GTAGGGGTGA GCGCTGACTA 

651 CCGTCACGGT TCCAGTCCCA TCAACTATAT CATCGTTTAC AACAAGGCCA 

701 ACCCCGAGAT CTATTTCGAT GCTACTGATG GAAACCTAAG CTATAAAGAA

751 TGGTCTGCAA GCATCGGCAT CTCTACGTAT CTTAATGACT ATGTGCTTCC 

801 CTATGCATCC GTATCTATAG GAAATACTTC AAGAAAAGCT CCTTCTGATA

851 GCTTCACAGA ACTCGAAAAG CAATTTACGA ATTTTAAATT TAAAATTCGT 

901 AAAATCACAA ACTTCGACAG AGTAAACTTC TGCTTCGGAA CTACCTGCTG 

951 CATCTCAAAT AACTTCTACT ATAGTGTAGA AGGCCGTTGG GGATATCAGC

1001 GTGCTATCAA CATTACGTCA GGTCTGCAGT TTTAG 

The PSORT algorithm predicts a bacterial outer membrane location (0.936). 

The protein was expressed in E.coli and purified as a his-tag product and as a GST-fusion product.
The purified GST-fusion product is shown in Figure 34A. The GST-fusion protein was used to
immunise mice, whose sera were used in a Western blot (34B) and for FACS analysis (34C).

The cp7170 protein was also identified in the 2D-PAGE experiment (Cpn0854).

These experiments show that cp7170 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.

Example 35 

The following C.pneumoniae protein (pid 4377072) was expressed <SEQ ID 69; cp7072>:

1 MDIKKLFCLF LCSSLIAMSP IYGKTGDYEK LTLTGINIID RNGLSETICS

51 KEKLKKYTKV DFLAPQPYQK VMRMYKNKRG DNVSCLTAYH TNGQIKQYLE

101 CLNNRAYGRY REWHVNGNIK IQAEVIGGIA DLHPSAESGW LFDQTTFAYN

151 DEGILEAAIV YEKGLLEGSS VYYHTNGNIW KECPYHKGVP QGKFLTYTSS

201 GKLLKEQNYQ QGKRHGLSIR YSEDSEEDVL AWEEYHEGRL LKAEYLDPQT

251 HEIYATIHEG NGIQAIYGKY AVIETRAFYR GEPYGKVTRF DNSGTQIVQT

301 YNLLQGAKHG EEFFFYPETG KPKLLLNWHE GILWGIVKTW YPGGTLESCK

351 ELVWNKKSGL LTIYYPEGQI MATEEYDNDL LIKGEYFRPG DRHPYSKIDR

401 GCGTAVFFSS AGTITKKIPY QDGKPLLN*

A predicted signal peptide is highlighted.

The cp7072 nucleotide sequence <SEQ ID 70> is: 

1 ATGGATATAA AAAAACTCTT TTGCTTATTT CTATGTTCTT CTCTAATTGC 

51 CATGAGTCCC ATTTATGGGA AAACAGGTGA CTATGAGAAA CTCACCCTTA 

101 CAGGGATCAA TATCATTGAT AGAAACGGCC TGTCAGAAAC TATTTGCTCT 

151 AAAGAGAAGC TAAAGAAATA CACCAAGGTA GACTTTCTTG CTCCCCAGCC

201 CTATCAAAAG GTCATGAGGA TGTATAAAAA CAAACGCGGA GATAACGTTT 

251 CTTGTTTAAC AGCCTATCAC ACTAACGGGC AAATTAAGCA GTACCTGGAG 

301 TGTCTCAATA ATCGTGCTTA TGGAAGATAT CGTGAATGGC ACGTCAACGG 

351 GAATATCAAA ATCCAAGCTG AGGTTATCGG AGGTATTGCG GATCTTCATC 

401 CCTCAGCAGA GTCTGGCTGG CTATTTGATC AAACTACATT TGCCTATAAT

451 GATGAAGGTA TCTTAGAAGC CGCTATCGTC TATGAAAAAG GGCTGCTCGA 

501 AGGATCTTCG GTGTATTACC ATACTAATGG GAATATTTGG AAAGAGTGTC 

551 CCTATCATAA GGGAGTTCCT CAAGGTAAAT TCCTGACATA CACATCTTCG 

601 GGGAAACTGC TCAAAGAACA GAATTACCAA CAAGGCAAAA GACACGGTCT 

651 TTCGATTCGC TACAGCGAAG ATTCCGAAGA AGATGTTCTA GCCTGGGAAG

701 AATATCATGA GGGACGACTC CTAAAAGCAG AGTACTTAGA TCCTCAAACT 

751 CACGAAATCT ATGCGACTAT ACACGAAGGG AACGGCATTC AAGCAATCTA

801 CGGCAAGTAT GCCGTTATAG AAACTAGGGC ATTTTACCGA GGGGAACCTT 

851 ATGGAAAAGT TACCAGATTC GACAACTCCG GAACACAGAT TGTCCAAACG 

901 TATAACCTTT TGCAAGGCGC GAAGCACGGA GAAGAATTTT TCTTTTATCC

951 TGAGACAGGG AAACCCAAGC TGCTTCTTAA TTGGCATGAA GGAATTTTAA 

1001 ATGGGATAGT AAAAACTTGG TATCCCGGAG GAACCTTAGA AAGTTGTAAA 

1051 GAACTCGTAA ATAACAAAAA ATCCGGGTTA CTGACCATTT ACTACCCTGA 

1101 AGGACAGATC ATGGCGACCG AAGAGTATGA TAATGATCTT CTAATTAAAG 

1151 GAGAGTACTT CCGCCCTGGA GACCGTCATC CCTACTCTAA AATAGATCGT

1201 GGTTGTGGGA CTGCAGTATT TTTCTCGTCG GCGGGAACTA TTACTAAAAA 

1251 AATCCCCTAT CAGGACGGCA AACCTTTGCT CAACTAG 

The PSORT algorithm predicts a periplasmic location (0.688). 

The protein was expressed in E.coli and purified as a his-tag product (Figure 35A) and as a
GST-fusion product (Figure 35B). The recombinant his-tag protein was used to immunise mice,
whose sera were used in a Western blot (Figure 35C) and for FACS analysis.

These experiments show that cp7072 is a useful immunogen. These properties are not evident from 
the sequence alone. 

Example 36 

The following C.pneumoniae protein (pid 4376879) was expressed <SEQ ID 71; cp6879>:

1 MATPAQKSPT FQDPSFVREL GSNHPVFSPL TLEERGEMAI ARVQQCGWNH

51 TIVKVSLIIL ALLTILGGGL LVGLLPAVPM FIGTGLIALG AVIFALALIL

101 CLTDSQGLPE ELPPVPEPQQ IQIEDLRNET REVLEGTLLE VLLKDRDAKD

151 PAVPQVVVDC EKRLGMLDRK LRREEEILYR STAHLKDEER YEFLLELLEM

201 RSLVADRLEF WRRSYERFVQ GIMTVRSEEG EKEISRLQDL ISLQQQTVQD

251 LRSRIDDEQK RCWTALQRIN QSQKDIQRAH DREASQRACE GTEMDCAERQ

301 QLEKDLRRQL KSMQEWIEMR GTIHQQEKAW RKQNAKLERL QEDLRLTGIA

351 FDEQSLFYRE YKEKYLSQKL DMQKILQEVW AEKSEKACLE SLVHDYEKQL

401 EQKDANLKKA AAVWEEELGK QQQEDYEQTQ EIRRLSTFIL EYQDSLREAE

451 KVEKDFQELQ QRYSRLQEEK QVKEKILEES MNHFADLFEK AQKENMAYKK

501 KLADLEGAAA PTEIGEDDDW VLTDSASLSQ KKIRELVEEN QEDLKALAFK

551 SNELTQLVAD AVEAEKEISK LREHIEEQKE GLRALDKMHA QAIKDCEAAQ

601 RKCCDLESLL SPVREDAGMR FELEVELQRL QEENAQLRAE VERLEQEQFQ

651 G*



The cp6879 nucleotide sequence <SEQ ID 72> is: 



1 ATGGCAACAC CCGCTCAAAA ATCCCCTACA TTTCAAGATC CTAGTTTTGT

51 AAGAGAGCTA GGCAGTAACC ACCCTGTCTT TTCCCCGCTA ACGCTTGAGG

101 AAAGAGGGGA GATGGGAATA GCTCGAGTCC AGCAGTGTGG ATGGAATCAT

151 ACAATTGTTA AGGTAAGTCT TATTATTCTT GCTCTTCTTA CTATTTTAGG

201 GGGAGGATTA CTCGTAGGAT TGCTGCCAGC AGTTCCTATG TTTATTGGAA

251 CAGGTCTGAT TGCTTTGGGA GCCGTTATAT TTGCTTTGGC TTTGATTTTA

301 TGTCTTTATG ATTCTCAGGG CCTTCCTGAG GAACTCCCTC CGGTTCCTGA

351 ACCACAACAA ATTCAGATTG AAGATTTAAG AAACGAGACC AGAGAAGTTC

401 TTGAAGGGAC TCTTTTAGAG GTTCTCTTAA AGGATAGAGA CGCTAAGGAC

451 CCTGCGGTGC CCCAGGTGGT TGTAGACTGT GAAAAGCGTC TTGGAATGTT

501 GGATCGTAAG CTGCGACGTG AAGAGGAGAT TCTGTATCGC TCGACGGCCC

551 ATCTTAAAGA CGAGGAAAGG TATGAGTTCT TGCTGGAGCT CTTGGAAATG

601 CGTAGTCTGG TTGCCGATCG GCTAGAATTT AACCGTAGAA GTTATGAGCG

651 ATTTGTTCAA GGAATTATGA CAGTTAGATC AGAGGAGGGG GAAAAAGAGA

701 TTTCTCGTCT ACAAGATCTA ATCAGTTTGC AGCAGCAGAC GGTGCAAGAT

751 TTAAGGAGTC GGATCGATGA CGAGCAGAAG AGATGCTGGA CGGCTTTACA

801 ACGTATTAAC CAATCTCAGA AGGATATACA ACGGGCTCAT GATCGCGAGG

851 CTTCGCAGCG TGCCTGTGAG GGCACAGAGA TGGATTGTGC AGAACGCCAG

901 CAACTGGAGA AGGATTTAAG GAGACAGCTG AAATCTATGC AGGAGTGGAT

951 TGAGATGAGG GGCACAATCC ATCAACAAGA GAAGGCTTGG CGTAAGCAGA

1001 ATGCCAAATT AGAAAGATTA CAAGAGGATC TGAGACTTAC TGGGATTGCT

1051 TTTGACGAAC AATCTCTGTT CTATCGCGAA TATAAAGAGA AATATCTGAG

1101 TCAGAAACTA GATATGCAAA AGATTTTACA GGAAGTCAAC GCAGAGAAAA

1151 GTGAGAAGGC TTGCTTAGAG AGTCTGGTCC ATGACTATGA GAAGCAGCTC

1201 GAACAAAAAG ATGCTAATCT GAAGAAAGCA GCAGCTGraT GGGAAGAAGA

1251 ATTAGGGAAG CAGCAACAGG AAGACTACGA ACAAACCCAA GAAATTAGAC

1301 GTCTGAGTAC ATTCATTCTT GAGTACCAGG ACAGTCTGCG TGAGGCAGAA

1351 AAAGTTGAGA AAGATTTCCA AGAGCTACAA CAAAGGTATA GCCGTCTTCA

1401 AGAGGAGAAA CAGGTAAAAG AAAAAATCTT AGAAGAAAGT ATGAATCATT

1451 TTGCCGATCT CTTTGAGAAG GCTCAAAAGG AAAACATGGC CTACAAGAAG

1501 AAGTTAGCGG ATTTAGAGGG TGCCGCTGCT CCTACTGAGA TCGGTGAGGA

1551 CGATGACTGG GTACTCACAG ATTCTGCTTC TCTCAGCCAG AAGAAGATCC

1601 GCGAACTCGT GGAAGAGAAT CAAGAACTCC TGAAAGCACT TGCATTTAAA

1651 TCTAACGAAT TGACTCAACT GGTTGCCGAT GCTGTAGAAG CTGAAAAAGA

1701 AATCAGCAAG CTTCGAGAAC ACATAGAAGA GCAGAAAGAA GGATTACGAG

1751 CTCTTGATAA GATGCATGCA CAAGCGATCA AAGATTGCGA AGCTGCTCAG

1801 AGAAAATGCT GTGACCTTGA GAGCCTTCTC TCTCCTGTTC GAGAAGATGC

1851 TGGAATGAGA TTTGAGCTAG AGGTCGAGCT TCAAAGATTG CAAGAAGAAA

1901 ATGCACAGCT TAGAGCGGAG GTTGAAAGAC TAGAGCAAGA GCAATTTCAA

1951 GGATAA



The PSORT algorithm predicts an inner membrane location (0.646). 

The protein was expressed in E.coli and purified as a his-tag product and as a GST-fusion product.
The purified GST-fusion product is shown in Figure 36A. The recombinant GST-fusion protein was 
used to immunise mice, whose sera were used in a Western blot (Figure 36B) and for FACS analysis. 

These experiments show that cp6879 is a useful immunogen. These properties are not evident from
the sequence alone.

Example 37

The following C.pneumoniae protein (pid 4376767) was expressed <SEQ ID 73; cp6767>:

1 MIKQIGRFFR AFIFIMPLSL TSCESKIDRN RIWIVGTNAT YPPFEYVDAQ

51 GEVVGFDIDL AKAISEKLGK QLEVREFAFD ALILNLKKHR IDAILAGMSI

101 TPSRQKEIAL LPYYGDEVQE LMVVSKRSLE TPVLPLTQYS SVAVQTGTFQ

151 EHYLLSQPGI CVRSFDSTLE VIMEVRYGKS PVAVLEPSVG RVVLKDFPNL

201 VATRLELPPE CWVLGCGLGV AKDRPEEIQT IQQAITDLKS EGVIQSLTKK

251 WQLSEVAYE*

The cp6767 nucleotide sequence <SEQ ID 74> is: 

1 ATGATAAAAC AAATAGGCCG TTTTTTTAGA GCATTTATTT TTATAATGCC

51 TTTATCTTTA ACAAGTTGTG AGTCTAAAAT CGATCGAAAT CGCATCTGGA

101 TTGTAGGTAC GAATGCTACA TATCCTCCTT TTGAGTATGT GGATGCTCAG

151 GGGGAAGTTG TAGGTTTCGA TATAGATTTG GCAAAGGCAA TTAGTGAAAA

201 ACTTGGCAAG CAATTGGAAG TTAGAGAATT CGCTTTCGAT GCTTTAATTT

251 TAAATTTAAA AAAACATCGT ATCGATGCAA TTTTAGCAGG AATGTCCATT

301 ACTCCTTCGC GTCAGAAGGA AATCGCCCTG CTTCCCTATT ATGGCGATGA

351 GGTTCAAGAG CTGATGGTGG TTTCTAAGCG GTCTTTAGAG ACCCCTGTGC

401 TTCCCCTAAC ACAGTATTCT TCTGTTGCTG TTCAGACAGG AACGTTTCAG 

451 GAGCATTATC TTTTATCTCA GCCCGGAATT TGTGTCCGTT CTTTTCATAG

501 CACCTTGGAG GTGATTATGG AAGTTCGTTA TGGGAAATCT CCGGTTGCCG

551 TTCTAGAACC CTCGGTAGGA CGTGTCGTTC TTAAAGACTT CCCTAATCTT

601 GTTGCAACAA GATTAGAGCT CCCTCCTGAA TGTTGGGTGT TGGGCTGTGG 

651 TCTCGGCGTA GCTAAAGATC GTCCTGAAGA AATACAAACG ATTCAACAAG

701 CGATTACAGA TTTAAAGAGC GAAGGGGTGA TTCAATCTTT AACCAAGAAA

751 TGGCAACTTT CTGAAGTTGC TTACGAATAG

The PSORT algorithm predicts an inner membrane location (0.083). 

The protein was expressed in E.coli and purified as a his-tag product and as a GST-fusion product.
The purified his-tag product is shown in Figure 37A. The recombinant his-tag protein was used to
immunise mice, whose sera were used in a Western blot (Figure 37B) and for FACS analysis (Figure
37C). The GST-fusion was also used in a Western blot (Figure 37D).

The cp6767 protein was also identified in the 2D-PAGE experiment and showed good 
cross-reactivity with human sera, including sera from patients with pneumonitis. 

These experiments show that cp6767 is a useful immunogen. These properties are not evident from 
the sequence alone. 
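
The statement that these properties are not evident from the sequence alone can be illustrated by setting the PSORT predictions quoted in Examples 30 to 37 beside the conclusions drawn from the experiments. The following is a minimal sketch; the record type `Psort`, its field names and the helper function are illustrative assumptions, and `surface_exposure_reported` is True only where the text explicitly concludes that the protein is surface-exposed.

```python
from typing import NamedTuple


class Psort(NamedTuple):
    protein: str
    location: str   # top PSORT prediction quoted in the text
    score: float    # PSORT score quoted in the text
    surface_exposure_reported: bool


RESULTS = [
    Psort("cp6998", "outer membrane", 0.707, True),
    Psort("cp7102", "inner membrane", 0.338, True),
    Psort("cp7106", "cytoplasm", 0.224, True),
    Psort("cp7228", "inner membrane", 0.040, True),
    Psort("cp7170", "outer membrane", 0.936, True),
    Psort("cp7072", "periplasm", 0.688, False),
    Psort("cp6879", "inner membrane", 0.646, False),
    Psort("cp6767", "inner membrane", 0.083, False),
]


def unexpected_surface_proteins(results):
    """Proteins reported as surface-exposed although PSORT's top
    prediction was not an outer membrane location."""
    return [r.protein for r in results
            if r.surface_exposure_reported and r.location != "outer membrane"]


print(unexpected_surface_proteins(RESULTS))
```

On these figures, three of the five proteins reported as surface-exposed (cp7102, cp7106 and cp7228) were not predicted to lie in the outer membrane.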

Example 38

The following C.pneumoniae protein (pid 4376717) was expressed <SEQ ID 75; cp6717>: 

1 MMSRLRFRIA ALGIFFILLV PNSVSAKTIV ASDKEKVGVL VYDNSVEAFQ

51 QILDCIDHAN FYVELCFCMT GGRTLKEMVD HLEARMDLVP ELCSYIIIQP

101 TFTDAEDQKL LKALKERHPN RFFYVFTGCP PSTSILAPNV IEMHIKLSII

151 DGKYCILGGT NFEEFMCTPG DEVPEKVDNP RLFVSGVRRP LAFRDQDIML

201 RSTAFGLQLR EEYHKQFAMW DYYAHHMWFI DNPEQFAGAC PPLTLEQAEE

251 TVFPGFDKHE DLVLVDSSKI RIVLGGPHDK QPNPVTQEYL KLIQGARSSV

301 KLAHMYFIPK DEDLNALVDV SHNHGVHLSL ITNGCHELSP AITGPYAWGN

351 RIWYFALLYG KRYPLWKKWF CEKLKPYERV SIYEFAIWET QLHKKCMIID

401 DEIFVIGSYH FGKKSDAFDY ESIVVIESPE VAAKANKVFN KDIGLSIPVS

451 HGDIFSWYFH SVHHTLGHLQ LTYMPA*

A predicted signal peptide is highlighted. 



The cp6717 nucleotide sequence <SEQ ID 76> is:

1 ATGATGAGTC GGTTGCGTTT TCGCTTGGCA GCTCTTGGAA TATTTTTTAT

51 TTTGCTGGTT CCTAATTCTG TTTCAGCAAA GACAATCGTA GCTTCAGACA

101 AGGAGAAGGT TGGAGTTCTT GTTTATGACA ATAGTGTAGA GGCCTTTCAA

151 CAGATATTGG ATTGCATAGA TCATGCAAAT TTTTATGTAG AACTGTGTCC

201 CTGCATGACA GGAGGCCGAA CGCTTAAAGA GATGGTAGAT CACCTCGAGG

251 CTCGTATGGA TCTGGTTCCA GAGCTCTGTA GCTATATCAT TATCCAACCC

301 ACGTTTACCG ATGCTGAAGA CCAAAAATTA CTCAAAGCTC TCAAAGAACG

351 TCATCCCAAC CGGTTTTTCT ACGTTTTTAC AGGGTGCCCA CCCTCAACAA

401 GCATCCTCGC TCCTAATGTC ATTGAAATGC ATATCAAACT TTCTATCATC

451 GATGGGAAAT ATTGTATTTT AGGTGGTACC AATTTTGAAG AGTTTATGTG

501 CACTCCAGGG GATGAGGTTC CTGAGAAAGT GGATAACCCA CGTTTATTTG

551 TCAGTGGAGT GCGTCGGCCC CTAGCATTTC GTGATCAGGA TATCATGTTG

601 CGTTCTACAG CATTCGGTTT GCAGCTCAGA GAAGAATATC ATAAGCAATT

651 TGCTATGTGG GACTACTATG CACATCATAT GTGGTTCATT GATAATCCTG

701 AACAGTTTGC AGGCGCCTGT CCTCCACTGA CTTTAGAACA AGCCGAGGAG

751 ACAGTATTTC CTGGATTTGA CAAACATGAA GATCTTGTTC TTGTCGACTC

801 TTCCAAGATC AGGATAGTTT TAGGTGGTCC CCACGATAAG CAACCCAATC

851 CTGTGACTCA AGAATATTTG AAACTTATCC AGGGAGCTAG ATCTTCTGTG

901 AAGCTTGCTC ACATGTATTT CATCCCTAAG GACGAGCTTT TAAATGCTCT

951 TGTCGACGTT TCTCATAATC ACGGTGTTCA TCTGAGTTCPA ATTACGAACG

1001 GCTGTCATGA ATTAAGTCCT GCAATTACAG GACCCTATGC TTGGGGAAAC

1051 CGTATTAACT ATTTCGCCTT GCTCTATGGG AAACGGTATC CTCTTTGGAA

1101 AAAATGGTTT TGCGAAAAGC TAAAACCTTA TGAGCGGGTT TCTATTTATG

1151 AGTTTGCTAT TTGGGAAACG CAGTTGCACA AGAAGTGTAT GATTATCGAT

1201 GATGAAATTT TTGTGATCGG AAGTTATAAT TTTGGAAAGA AAAGTGATGC

1251 CTTTGATTAC GAAAGTATTG TAGTTATCGA ATCTCCAGAA GTCGCTGCAA

1301 AAGCTAACAA AGTCTTCAAT AAAGATATCG GATTGTCGAT TCCTGTAAGT

1351 CATGGCGACA TTTTCTCTTG GTATTTCCAT TCCGTACACC ACACTTTGGG

1401 ACATTTGCAG CTGACCTATA TGCCAGCCTA G

The PSORT algorithm predicts a periplasmic location (0.939).

The protein was expressed in E.coli and purified as a GST-fusion (Figure 38A), as a his-tagged
protein, and as a GST/his fusion product. The proteins were used to immunise mice, whose sera were
used in a Western blot (Figure 38B) and for FACS analysis.

These experiments show that cp6717 is a useful immunogen. These properties are not evident from 
the sequence alone. 



Example 39 

The following C.pneumoniae protein (pid 4376577) was expressed <SEQ ID 77; cp6577>:

1 MKKLLFSTFL LVLGSTSAAH ANLGYVNLKR CLEESDLGKK ETEELEAMKQ 

51 QFVKNAEKIE EELTSIYNKL QDEDYMESLS DSASEELRKK FEDLSGEYNA

101 YQSQYYQSIN QSNVKRIQKL IQEVKIAAES VRSKEKLEAI LNEEAVLAIA

151 PGTDKTTEII AILNESFKKQ N* 

A predicted signal peptide is highlighted. 
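
The highlighting of the predicted signal peptide has not survived in this text, but the hydrophobic character typical of such N-terminal leaders can be seen numerically. The following is a rough sketch using the Kyte-Doolittle hydropathy scale; the 20-residue window is an arbitrary assumption rather than the predicted cleavage position, and the helper `mean_hydropathy` is illustrative only.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
      "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
      "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
      "Y": -1.3, "V": 4.2}


def mean_hydropathy(seq: str) -> float:
    """Average Kyte-Doolittle hydropathy over a peptide."""
    return sum(KD[aa] for aa in seq) / len(seq)


# cp6577 protein as listed above, without the terminal '*'.
CP6577 = ("MKKLLFSTFLLVLGSTSAAHANLGYVNLKRCLEESDLGKKETEELEAMKQ"
          "QFVKNAEKIEEELTSIYNKLQDEDYMESLSDSASEELRKKFEDLSGEYNA"
          "YQSQYYQSINQSNVKRIQKLIQEVKIAAESVRSKEKLEAILNEEAVLAIA"
          "PGTDKTTEIIAILNESFKKQN")

WINDOW = 20  # assumed window length, not the predicted cleavage site
n_term = mean_hydropathy(CP6577[:WINDOW])
rest = mean_hydropathy(CP6577[WINDOW:])
print(f"N-terminal {WINDOW} residues: {n_term:+.2f}; remainder: {rest:+.2f}")
```

A positive mean for the N-terminal window against a negative mean for the remainder is consistent with a hydrophobic leader on an otherwise hydrophilic protein; it does not replace a dedicated signal peptide predictor.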

The cp6577 nucleotide sequence <SEQ ID 78> is:

1 ATGAAAAAAT TATTATTTTC TACATTTCTT CTTGTTTTAG GATCAACAAG

51 CGCAGCTCAT GCAAATTTAG GCTATGTTAA TTTAAAGCGA TGTCTTGAAG

101 AATCCGATCT AGGTAAAAAG GAAACTGAAG AATTGGAAGC TATGAAACAG

151 CAGTTTGTAA AAAATGCTGA GAAAATAGAA GAAGAACTCA CTTCTATTTA 

201 TAATAAGTTG CAAGATGAAG ATTACATGGA AAGCCTATCG GATTCTGCCT 

251 CTGAAGAGTT GCGAAAGAAA TTCGAAGATC TTTCAGGAGA GTACAATGCG 

301 TACCAGTCTC AGTACTATCA ATCTATCAAT CAAAGTAATG TAAAACGCAT

351 TCAAAAACTC ATTCAAGAAG TAAAAATAGC TGCAGAATCA GTGCGGTCCA 

401 AAGAAAAACT AGAAGCTATC CTTAATGAAG AAGCTGTCTT AGCAATAGCA 

451 CCTGGGACTG ATAAAACAAC CGAAATTATT GCTATTCTTA ACGAATCTTT 

501 CAAAAAACAA AACTAG 

The PSORT algorithm predicts a periplasmic space location (0.932).

The protein was expressed in E.coli and purified as a his-tag product (Figure 39A) and as a
GST-fusion product (Figure 39B). The recombinant GST-fusion protein was used to immunise mice,
whose sera were used in a Western blot (Figure 39C) and for FACS analysis.

The cp6577 protein was also identified in the 2D-PAGE experiment. 

These experiments show that cp6577 is a useful immunogen. These properties are not evident from
the sequence alone. 
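
The numbered listing format used for the sequences can be reduced to a contiguous string and checked against the length of the corresponding protein. The following is a minimal sketch using the cp6577 nucleotide listing from this Example; the function name `parse_listing` is illustrative only, and the approach assumes that every character other than A, C, G and T is layout (position numbers, spaces, line breaks).

```python
import re

CP6577_LISTING = """
1 ATGAAAAAAT TATTATTTTC TACATTTCTT CTTGTTTTAG GATCAACAAG
51 CGCAGCTCAT GCAAATTTAG GCTATGTTAA TTTAAAGCGA TGTCTTGAAG
101 AATCCGATCT AGGTAAAAAG GAAACTGAAG AATTGGAAGC TATGAAACAG
151 CAGTTTGTAA AAAATGCTGA GAAAATAGAA GAAGAACTCA CTTCTATTTA
201 TAATAAGTTG CAAGATGAAG ATTACATGGA AAGCCTATCG GATTCTGCCT
251 CTGAAGAGTT GCGAAAGAAA TTCGAAGATC TTTCAGGAGA GTACAATGCG
301 TACCAGTCTC AGTACTATCA ATCTATCAAT CAAAGTAATG TAAAACGCAT
351 TCAAAAACTC ATTCAAGAAG TAAAAATAGC TGCAGAATCA GTGCGGTCCA
401 AAGAAAAACT AGAAGCTATC CTTAATGAAG AAGCTGTCTT AGCAATAGCA
451 CCTGGGACTG ATAAAACAAC CGAAATTATT GCTATTCTTA ACGAATCTTT
501 CAAAAAACAA AACTAG
"""


def parse_listing(text: str) -> str:
    """Strip position numbers and whitespace, keeping only A, C, G, T."""
    return re.sub(r"[^ACGT]", "", text.upper())


seq = parse_listing(CP6577_LISTING)
n_codons, remainder = divmod(len(seq), 3)
# The final codon is the stop codon, so the protein has n_codons - 1 residues.
print(len(seq), n_codons - 1, remainder, seq[:3], seq[-3:])
```

The 516 nucleotides give 172 codons, that is 171 residues plus a stop codon, which agrees with the 171-residue cp6577 protein listing. Because unrecognised characters are silently dropped, a mis-scanned base shortens the sequence, so a length that is not a multiple of three is a useful warning sign.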

Example 40 

The following C.pneumoniae protein (pid 4376446) was expressed <SEQ ID 79; cp6446>:

1 MKQPMSLIFS SVCLGLGLGS LSSCKQKPSW NYHNTSTSEE FFVHGNKSVS

51 QLPHYPSAFR TTQIFSEEHN DPYVVAKTDE ESRKIWREIH KNLKIKGSYI

101 PISTYGSLMH PKSAALTLKT YRPHPIWING YERSFNIDTG KYLKNGSRRR

151 TSHDGPKNRA VLNLIKSSGR RCNAIGLEMT EEDFVIARRR EGVYSLYPVE

201 VCSYPQGNPF VIAYAWIADE SACSKEVLFV KGYYSLVWES VSSSDSLNAF

251 GDSFAEDYLR STFLANGTSI LCVHESYKKV PPQP*

A predicted signal peptide is highlighted.

The cp6446 nucleotide sequence <SEQ ID 80> is:

1 ATGAAACAGC CCATGTCTCT TATCTTTTCA AGTGTATGTT TAGGATTAGG

51 TCTTGGATCT CTTTCCTCCT GTAATCAAAA GCCCTCTTGG AATTATCACA 

101 ACACTTCAAC GAGCGAAGAA TTCTTTGTTC ATGGAAATAA GAGTGTTTCG

151 CAACTGCCTC ATTATCCTTC TGCATTTCGT ACGACTCAAA TCTTTTCTGA

201 AGAGCACAAT GATCCTTATG TCGTAGCTAA GACTGATGAA GAGTCTCGTA

251 AAATTTGGAG AGAAATCCAT AAAAATCTCA AAATCAAAGG TTCTTACATT 

301 CCCATATCGA CTTATGGAAG TCTGATGCAC CCAAAATCAG CAGCTCTTAC 

351 ATTAAAAACG TATCGTCCAC ATCCTATTTG GATAAATGGA TACGAGCGTT 

401 CTTTTAATAT AGACACAGGA AAGTACTTAA AAAACGGAAG TCGCCGTAGA

451 ACTTCTCACG ATGGTCCGAA AAATCGAGCT GTACTGAATC TCATTAAATC 

501 TTCGGGACGA CGCTGTAATG CTATAGGCCT TGAGATGACA GAAGAAGACT 

551 TTGTAATAGC TAGAAGGCGA GAAGGTGTTT ATAGCCTGTA TCCCGTTGAA 

601 GTGTGCTCGT ATCCTCAGGG GAATCCTTTT GTCATTGCTT ATGCCTGGAT 

651 TGCAGATGAG AGTGCTTGCT CAAAAGAGGT CCTACCTGTA AAAGGGTACT

701 ATTCTTTAGT CTGGGAAAGC GTTTCTTCCT CTGATTCTCT GAATGCTTTT 

751 GGAGATTCCT TTGCAGAGGA CTACCTCAGA AGCACGTTTT TAGCAAACGG 

801 AACTTCTATA CTCTGTGTTC ATGAAAGCTA TAAGAAAGTT CCTCCTCAGC 

851 CCTAA 

The PSORT algorithm predicts an inner membrane location (0.177).

The protein was expressed in E.coli and purified as a his-tag product and a GST-fusion product. The
GST-fusion product is shown in Figure 40A. The recombinant his-tag protein was used to immunise
mice, whose sera were used in a Western blot (Figure 40B) and for FACS analysis.

These experiments show that cp6446 is a useful immunogen. These properties are not evident from
the sequence alone.



Example 41 

The following C.pneumoniae protein (pid 4377108) was expressed <SEQ ID 81; cp7108>:

1 MSKKIKVLGH LTLCTLFRGV LCAAALSNIG YASTSQESPY QKSIEDWKGY

51 TFTDLELLSK EGWSEAHAVS GNGSRIVGAS GAGQGSVTAV IWESHLIKHL

101 GTLGGEASSA EGISKDGEVV VGWSDTREGY THAFVFDGRD MKDLGTLGAT

151 YSVARGVSGD GSIIVGVSAT ARGEDYGWQV GVKWEKGKIK QLKLLPQGLW

201 SEANAISEDG TVIVGRGEIS RNHIVAVKWN KNAVYSLGTL GGSVASAEAI

251 SANGKVIVGW STTNNGETHA FMHKDETMHD LGTLGGGPSV ATGVSADGRA

301 IVGFSAVKTG EIHAFYYAEG EMEDLTTLGG EEARVFDISS EGNDIIGSIK

351 TDAGAERAYL FHIHK*



A predicted signal peptide is highlighted. 
The cp7108 nucleotide sequence <SEQ ID 82> is: 



1 ATGAGTAAGA AGATAAAGGT TCTAGGTCAT TTGACGCTCT GCACTCTGTT 
51 TAGAGGAGTG CTGTGTGCAG CGGCCCTTTC CAACATAGGA TATGCGAGTA 



101 CTTCTCAGGA ATCACCATAT CAGAAGTCTA TAGAAGACTG GAAAGGGTAT 

151 ACCTTTACAG ATCTTGAGTT ACTGAGTAAG GAAGGGTGGT CTGAAGCTCA 

201 TGCAGTTTCT GGAAATGGCA GTAGAATTGT AGGAGCTTCG GGAGCTGGCC 

251 AAGGTAGTGT GACTGCTGTC ATATGGGAAA GTCACCTCAT AAAACATCTC 

301 GGCACTTTAG GTGGCGAGGC TTCATCTGCA GAGGGAATTT CAAAGGATGG 

351 AGAGGTGGTC GTTGGGTGGT CAGATACTAG AGAGGGATAT ACTCATGCCT 

401 TTGTCTTCGA CGGTAGAGAT ATGAAAGATC TCGGTACTCT AGGAGCTACC 

451 TATTCTGTAG CAAGGGGTGT TTCTGGAGAT GGTAGTATCA TCGTAGGAGT 

501 CTCTGCAACT GCTCGTGGAG AGGATTACGG ATGGCAAGTT GGTGTCAAGT 

551 GGGAAAAAGG GAAAATCAAA CAATTGAAGT TGTTGCCTCA AGGTCTCTGG 

601 TCTGAGGCGA ATGCAATCTC TGAGGATGGT ACGGTGATTG TCGGGAGAGG 

651 GGAAATCTCT CGCAATCACA TCGTTGCTGT AAAATGGAAT AAAAATGCTG 

701 TGTATAGTTT GGGGACTCTC GGAGGTAGTG TCGCTTCAGC AGAGGCTATA 

751 TCGGCAAATG GGAAAGTAAT TGTAGGATGG TCCACGACTA ATAATGGTGA 

801 GACTCATGCC TTTATGCACA AAGATGAGAC AATGCACGAT CTCGGCACTC 

851 TAGGAGGAGG TTTTTCTGTC GCAACTGGAG TTTCTGCTGA TGGGAGAGCC 

901 ATCGTAGGAT TTTCAGCAGT GAAGACCGGA GAAATTCATG CTTTTTACTA

951 TGCAGAAGGA GAAATGGAGG ATTTAACAAC TTTGGGAGGG GAAGAAGCTC 

1001 GAGTGTTCGA CATATCTAGC GAAGGAAACG ATATCATTGG CTCTATAAAA 

1051 ACTGACGCTG GAGCTGAACG CGCCTATCTG TTCCATATAC ATAAATAA 

The PSORT algorithm predicts an outer membrane location (0.921). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 41A.
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 41B) and for FACS analysis (Figure 41C). A his-tagged protein was also expressed. 
The cp7108 protein was also identified in the 2D-PAGE experiment. 

These experiments show that cp7108 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone.

Example 42 

The following Cpneumoniae protein (pid 4377287) was expressed <SEQ ID 83; cp7287>: 



1 MVAKKTVRSY RSSFSHSVIV AILSAGIAFE AHSLHSSELD LGVFNKQFEE

51 HSAHVEEAQT SVLKGSDPVN PSQKESEKVL YTQVPLTQGS SGESLDLADA

101 NFLEHFQHLF EETTVFGIDQ KLVWSDLDTR NFSQPTQEPD TSNAVSEKIS

151 SDTKENRKDL ETEDPSKKSG LKEVSSDLPK SPETAVAAIS EDLEISEWIS

201 ARDPLQGLAF FYKNTSSQSI SEKDSSFQGI IFSGSGANSG LGFENLKAPK

251 SGAAVYSDRD IVFENLVKGL SFISCESLED GSAAGVWIVV THCGDVTLTD

301 CATGLDLEAL RLVKDFSRGG AVFTARNHEV QNNLAGGILS VVGNKGAIVV

351 EKNSAEKSNG GAFACGSFVY SNNENTALWK ENQALSGGAI SSASDIDIQG

401 NCSAIEFSGN QSLIALGEHI GLTDFVGGGA LAAQGTLTLR NNAVVQCVKW

451 TSKTHGGAIL AGTVDLNETI SEVAFKQNTA ALTGGALSAN DKVIIANWFG

501 EILFEQNEVR NHGGAIYCGC RSNPKLEQKD SGENINIIGN SGAITFLKNK

551 ASVLEVMTQA EDYAGGGALW GHNVLLDSWS GNIQFIGNIG GSTFWIGEYV

601 GGGAILSTDR VTISNNSGDV VFKGNKGQCL AQKYVAPQET APVESDASST

651 NKDEKSLNAC SHGDHYPPKT VEEEVPPSLL EEHPVVSSTD IRGGGAILAQ

IVi ^1™^ EESSTVGDIA IVGGGAIiLST NEVWVCSNQN 

751 VVFSDNVTSN GCDSGGAILA KKVDISANHS VEFVSNGSGK FGGAVCALITE

801 SVNXTDNGSA VSFSKNRTRL GGAGVAAPQG SVTICGNQGN IAFKENFVFG

851 SEWQRSGGGA IIANSSVNIQ DNAGDILFVS NSTGSYGGAI FVGSLVASEG

901 SNPRTLTITG NSGDILFAKN STQTAASLSE KDSFGGGAIY TQNLKIVKNA

951 GNVSFYGNRA PSGAGVQIAD GGTVCLEAFG GDILFEGNIN FDGSFNAIHL

1001 CGNDSKIVEL SAVQDKNIIF QDAITYEENT IRGLPDKPVS PLSAPSLIFN

1051 SKPQDDSAQH HEGTIRFSRG VSKIPQIAAI QEGTLALSQN AELWLAGLKQ

1101 ETGSSIVLSA GSILRIFDSQ VDSSAPLPTE NKEETLVSAG VQINMSSPTP

1151 NKDKAVDTPV LADIISITVD LSSFVPEQDG TLPLPPEXII PKGTKLHSNA

1201 IDLKIIDPTN VGYENHALLS SHKDIPLISL KTAEGMTGTP TADASLSNIK

1251 IDVSLPSITP ATYGHTGVWS ESKMEDGRLV VGWQPTGYKL NPEKQGALVL

1301 NNLWSHYTDL RALKQEIFAH HTIAQRMELD FSTNVWGSGL GWEDCQNIG

1351 EFDGFKHHLT GYALGLDTQL VEDFLIGGCF SQFFGKTESQ SYKAKNDVKS

1401 YMGAAYAGIL AGPWLIKGAF VYGNIMNDLP TDYGTLGIST GSWIGKGFIA

1451 GTSIDYRYIV NPRRFISAIV STWPFVEAE YVRIDLPEIS EQGKEVRTFQ

1501 KTRFENVAIP FGFALEHAYS RGSRAEVNSV QLAYVFDVYR KGPVSLITLK

1551 DAAYSWKSYG VDIPCKAWKA RLSNNTEWNS YLSTYIAFNY EWREDLIAYD

1601 FNGGIRIIF*


A predicted signal peptide is highlighted. 

The cp7287 nucleotide sequence <SEQ ID 84> is:

1

51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 
651 
701 
751 
801 
851 
901 
951 
1001 
1051 
1101 
1151 
1201 
1251 
1301 
1351 
1401 
1451 
1501 
1551 
1601 
1651 
1701 
1751 
1801 
1851 
1901 
1951 
2001 
2051 
2101 
2151 
2201 
2251 
2301 



ATGGTAGCGA 
CGTAATAGTA 
TACACAGCTC 
CATTCTGCTC 
TCCTGTAAAT 
TGCCTCTTAC 
AATTTCTTAG 
TATCGATCAA 
AACCCACTCA 
TCAGATACCA 
AAAAAGTGGC 
CTGCAGTAGC 
GCAAGAGATC 
TCAGTCTATC 
GTTCAGGAGC 
TCTGGGGCTG 
TAAAGGATTG 
CAGGTGTAAA 
TGTGCCACTG 
TCGTGGAGGA 
TTGCAGGTGG 
GAGAAAAATA 
TTTTGTTTAC 
CATTATCAGG 
AACTGTAGCG 
AGAGCATATA 
AAGGGACGCT 
ACTTCTAAAA 
CGAAACAATT 
GAGGTGCTTT 
GAAATTCTTT 
TTGTGGATGT 
ACATCAATAT 
GCTTCTGTTT 
CGCTTTATGG 
AATTTATAGG 
GGTGGTGGTG 
TGGAGATGTT 
ATGTAGCTCC 
AATAAAGACG 
TCCTAAAACT 
CTGTTGTTTC 
CATATCTTTA 
TGGTGGTGGT 
GAGGTGCTTT 
GTTGTTTTTT 
TATTTTAGCT 



AAAAAACAGT 
GCAATATTGT 
AGAACTAGAT 
ATGTTGAAGA 
CCCTCTCAGA 
CCAAGGAAGC 
AGCATTTTCA 
AAGCTGGTTT 
AGAACCTGAT 
AAGAGAATAG 
CTTAAAGAAG 
AGCTATTTCT 
CTCTTCAGGG 
TCTGAAAAGG 
TAATTCAGGG 
CAGTTTATTC 
AGTTTTATAT 
CATTGTTGTG 
GTTTAGACCT 
GCTGTTTTCA 
AATTCTATCC 
GTGCTGAGAA 
AGTAACAACG 
AGGAGCCATA 
CTATTGAATT 
GGGCTTACAG 
TACCTTAAGA 
CACATGGTGG 
AGCGAAGTTG 
AAGTGCAAAT 
TTGAGCAAAA 
CGATCTAATC 
TATTGGAAAC 
TAGAAGTGAT 
GGGCATAATG 
AAATATAGGT 
CGATTCTCTC 
GTTTTTAAAG 
TCAAGAAACA 
AGAAGAGCCT 
GTAGAAGAGG 
TTCGACAGAT 
TTACAGATAA 
GAAGAGTCTT 
GCTTTCTACT 
CTGATAACGT 
AAAAAAGTAG 



ACGATCTTAT 
CAGCAGGCAT 
TTAGGTGTAT 
GGCTCAAACA 
AAGAATCCGA 
TCTGGAGAGA 
GCATCTTTTT 
GGTCAGATTT 
ACAAGTAATG 
AAAAGACCTA 
TTTCATCAGA 
GAAGATCTTG 
TTTAGCATTT 
ATTCTTCATT 
CTAGGTTTTG 
TGATCGAGAT 
CTTGTGAATC 
ACCCATTGTG 
TGAAGCTTTA 
CTGCTCGCAA 
GTTGTAGGCA 
GTCCAATGGA 
AAAACACCGC 
TCCTCAGCAA 
TTCAGGAAAC 
ATTTTGTAGG 
AATAATGCAG 
AGCTATTTTA 
CCTTTAAGCA 
GATAAGGTTA 
CGAAGTGAGG 
CTAAGTTAGA 
TCCGGAGCTA 
GACACAAGCT 
TTCTTCTAGA 
GGAAGTACCT 
TACTGATAGA 
GAAACAAAGG 
GCTCCCGTGG 
TAATGCTTGT 
AAGTGCCACC 
ATTCGTGGTG 
TACAGGAAAT 
CTACTGTCGG 
AATGAAGTTA 
GACTTCAAAT 
ATATCTCCGC 



AGGTCTTCAT 
TGCTTTTGAA 
TCAATAAACA 
TCTGTTTTAA 
GAAGGTTTTG 
GTTTGGATCT 
GAAGAGACTA 
AGATACTAGG 
CTGTAAGTGA 
GAGACTGAAG 
TCTCCCTAAA 
AAATCTCAGA 
TTTTATAAAA 
TCAAGGAATT 
AAAATCTTAA 
ATTGTTTTTG 
TTTAGAAGAT 
GTGATGTAAC 
CGTCTGGTTA 
CCATGAAGTG 
ATAAAGGAGC 
GGAGCTTTTG 
CTTGTGGAAA 
GTGATATTGA 
CAGTCTCTAA 
TGGAGGAGCT 
TAGTGCAATG 
GCAGGTACTG 
GAATACAGCA 
TAATTGCAAA 
AATCACGGAG 
ACAAAAGGAT 
TCACTTTTTT 
GAAGATTATG 
TTCCAATAGT 
TCTGGATAGG 
GTGACAATTT 
CCAATGTCTT 
AATCAGATGC 
AGTCATGGAG 
TTCATTGTTA 
GTGGGGCCAT 
CTGAGATTCT 
TGATTTAGCT 
ATGTTTGCAG 
GGTTGTGATT 
GAACCACTCG 



TTTCTCATTC 
GCACATTCCT 
GTTTGAGGAA 
AGGGATCAGA 
TACACTCAAG 
CGCCGATGCT 
CAGTATTTGG 
AATTTTTCCC 
GAAAATCTCC 
ATCCTTCAAA 
AGTCCTGAAA 
AAACATTTCA 
ATACATCTTC 
ATCTTTTCTG 
GGCGCCGAAA 
AAAATCTTGT 
GGCTCTGCCG 
TCTCACTGAT 
AAGATTTTTC 
CAAAATAACC 
TATTGTTGTA 
CTTGCGGAAG 
GAAAATCAAG 
TATTCAAGGG 
TTGCTCTTGG 
TTAGCTGCTC 
TGTTAAAAAC 
TTGATCTCAA 
GCTCTAACTG 
TAACTTTGGA 
GAGCCATTTA 
TCTGGAGAGA 
AAAAAATAAG 
CTGGTGGAGG 
GGGAATATTC 
AGAATATGTC 
CTAATAACTC 
GCTCAAAAAT 
TTCATCTACA 
ATCATTATCC 
GAAGAACATC 
TCTAGCTCAA 
CTGGGAACCT 
ATCGTAGGAG 
TAACCAAAAT 
CAGGGGGAGC 
GTTGAATTTG 



2351 
2401 
2451 
2501 
2551 
2601 
2651 
2701 
2751 
2801 
2851 
2901 
2951 
3001 
3051 
3101 
3151 
3201 
3251 
3301 
3351 
3401 
3451 
3501 
3551 
3601 
3651 
3701 
3751 
3801 
3851 
3901 
3951 
4001 
4051 
4101 
4151 
4201 
4251 
4301 
4351 
4401 
4451 
4501 
4551 
4601 
4651 
4701 
4751 
4801 



TCTCTAATGG 

TCAGTAAACA 

AACACGTCTT 

TTTGTGGAAA 

TCTGAAAATC 

AAATATTCAG 

GATCTTATGG 

AGCAACCCAC 

TGCTAAAAAT 

TTGGTGGAGG 

GGGAACGTTT 

AATTGCAGAC 

TATTTGAAGG 

TGCGGGAATG 

TATTATTTTC 

TGCCAGATAA 

TCCAAGCCAC 

TTCTCGAGGG 

CCTTAGCTTT 

GAAACAGGAA 

TGATTCCCAG 

AGACTCTTGT 

AATAAAGATA 

TACTGTAGAT 

TTCCTCCTGA 

ATAGATCTTA 

TCTTCTAAGO 1 

AAGGAATGAC 

ATAGATGTAT 

AGTTTGGTCT 

AACCTACGGG 

AATAATCTCT 

CTTTGCTCAT 

ATGTCTGGGG 

GAGTTTGATG 

TACACAACTA 

TTGGTAAAAC 

TATATGGGAG 

AGGAGCTTTT 

GTACTTTAGG 

GGCACAAGCA 

GGCAATCGTA 

TAGATCTTCC 

AAAACTCGTT 

TGCTTATTCG 

ACGTCTTTGA 

GATGCTGCTT 

TTGGAAGGCT 

CGTATTTAGC 

TTCAATGGTG 



TTCAGGGAAA 
TTACGGACAA 
GGCGGTGCTG 
TCAGGGAAAC 
AAAGATCAGG 
GATAACGCAG 
AGGTGCTATT 
GAACGCTTAC 
AGCACGCAAA 
GGCCATCTAT 
CTTTCTATGG 

GGAGGAACTG 
GAATATCAAT 
ACTCAAAAAT 
CAAGATGCAA 
AGATGTCAGT 
AAGATGACAG 
GTATCTAAAA 
ATCACAAAAC 
GTTCTATCGT 
GTTGATAGCA 
TTCTGCCGGA 
AAGCTGTAGA 
TTGTCTTCAT 
AATTATCATT 
AGATTATAGA 
TCTCATAAAG 
AGGGACGCCT 
CTTTACCTTC 
GAAAGTAAAA 
ATATAAGTTA 
GGAGTCATTA 
CATACGATAG 
ATCAGGATTA 
GGTTCAAACA 
GTTGAAGACT 
TGAAAGCCAA 
CTGCTTATGC 
GTTTACGGTA 
TATTTCAACA 
TTGATTACCG 
TCCACAGTGG 
AGAAATTAGC 
TTGAGAATGT 
CGTGGCTCAC 
TGTATATCGT 
ATTCTTGGAA 
CGCTTGAGCA 
GTTTAATTAT 
GTATCCGTAT 



TTCGGTGGTG 
TGGCTCGGCA 
GAGTTGCAGC 
ATAGCATTTA 
TGGAGGAGCT 
GAGATATCCT 
TTTGTAGGAT 
AATTACAGGC 
CAGCCGCTTC 
ACACAAAACC 
CAACAGAGCT 
TTTGTTTAGA 
TTTGATGGGA 
CGTAGAGCTT 
TTACTTATGA 
CCTTTAAGTG 
CGCTCAACAT 
TTCCTCAGAT 
GCAGAGCTTT 
ATTGTCTGCG 
GTGCGCCTCT 
GTTCAAATTA 
TACTCCAGTA 
TTGTTCCTGA 
CCTAAGGGAA 
TCCTACCAAT 
ATATTCCATT 
ACAGCAGATG 
GATCACACCA 
TGGAAGATGG 
AATCCTGAGA 
TACAGATCTT 
CTCAAAGAAT 
GGTGTTGTTG 
TCATCTCACA 
TCTTAATTGG 
TCCTACAAAG 
GGGGATTTTA 
ATATAAACAA 
GGTTCATGGA 
CTATATTGTA 
TTCCTTTTGT 
GAACAGGGTA 
CGCCATTCCT 
GTGCTGAAGT 
AAGGGACCTG 
GAGTTATGGG 
ATAATACGGA 
GAATGGAGAG 
TATTTTCTAG 



CCGTTTGCGC 
GTATCATTCT 
TCCTCAAGGC 
AAGAGAACTT 
ATCATTGCTA 
ATTTGTAAGT 
CTTTGGTTGC 
AACAGTGGGG 
TTTATCAGAA 
TCAAAATTGT 
CCTAGTGGTG 
GGCTTTTGGA 
GTTTCAATGC 
TCTGCTGTTC 
AGAGAACACA 
CCCCTTCATT 
CATGAAGGGA 
TGCTGCTATA 
GGTTGGCAGG 
GGATCTATTC 
TCCTACAGAA 
ACATGAGCTC 
CTTGCAGATA 
GCAAGACGGA 
CAAAATTACA 
GTGGGATATG 
AATTTCTCTT 
CTTCTCTATC 
GCAACGTATG 
AAGACTTGTA 
AGCAAGGGGC 
AGAGCTCTTA 
GGAGTTAGAT 
AAGATTGTCA 
GGGTATGCCC 
AGGATGTTTC 
CTAAGAACGA 
GCAGGTCCTT 
CGAOOTGACT 
TAGGAAAAGG 
AATCCTCGAC 
AGAAGCCGAG 
AAGAGGTTAG 
TTTGGATTTG 
GAACAGTGTA 
TCTCTTTGAT 
GTAGATATTC 
ATGGAATTCA 
AAGATCTGAT 



TTTAAACGAA 
CTAAAAATAG 
TCTGTAACGA 
TGTTTTTGGC 
ACTCTTCTGT 
AACTCTACGG 
TTCTGAAGGC 
ATATCCTATT 
AAAGATTCCT 
AAAGAATGCA 
CTGGTGTCCA 
GGAGATATCT 
GATTCACTTA 
AAGATAAAAA 
ATTCGTGGCT 
AATTTTTAAC 
CGATACGGTT 
CAAGAGGGAA 
ACTTAAACAG 
TCCGTATTTT 
AATAAAGAGG 
TCCTACACCC 
TCATAAGTAT 
ACTCTTCCTC 
TTCTAATGCC 
AAAATCATGC 
AAGACAGCGG 
TAATATAAAA 
GTCACACAGG 
GTCGGTTGGC 
TCTAGTTTTG 
AGCAGGAGAT 
TTCTCGACAA 
GAACATCGGA 
TAGGCTTGGA 
TCACAGTTCT 
TGTGAAGAGT 
GGTOAATAAA 
ACAGATTACG 
GTTTATCGCA 
GGTTTATATC 
TATGTCCGTA 
AACGTTCCAA 
CTTTAGAACA 
CAGCTTGCTT 
TACACTCAAG 
CTTGTAAAGC 
TATTTAAGTA 
AGCTTATGAC 






The PSORT algorithm predicts an inner membrane location (0.106). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 42A. 
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 42B) and for FACS analysis (Figure 42C). A his-tagged protein was also expressed. 

The cp7287 protein was also identified in the 2D-PAGE experiment and showed good 
cross-reactivity with human sera, including sera from patients with pneumonitis. 

These experiments show that cp7287 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 43

The following C.pneumoniae protein (pid 4377105) was expressed <SEQ ID 85; cp7105>:

1 MSLYQKWWNS QLKKSLCYST VAALIFMIPS QESFADSLID LNLGLDPSVE 

51 CLSGDGAFSV GYFTKAGSTP VEYQPFKYDV SKKTPTILSV ETANQSGYAY 

101 GISYDGTITV GTCSLGAGKY NGAKWSADGT LTPLTGITGG TSHTEARAIS

151 KDTQVIEGFS YDASGQPKAV QWASGATTVT QLADISGGSR SSYAYAISDD 

201 GTIIVGSMES TITRKTTAVK WVNNVPTYLG TLGGDASTGL YISGDGTVIV 

251 GAANTATVTN GNQESHAYMV KDNQMKD* 

The cp7105 nucleotide sequence <SEQ ID 86> is: 

1 GTGAGTCTAT ATCAAAAATG GTGGAACAGT CAGTTAAAGA AGAGCCTCTG 

51 CTATTCGACT GTTGCTGCTC TAATATTTAT GATTCCTTCT CAAGAATCCT 

101 TTGCAGATAG TCTTATAGAT TTAAATTTAG GTTTAGATCC TTCGGTCGAA 

151 TGTCTGTCAG GAGATGGTGC ATTTTCTGTT GGGTATTTTA CTAAGGCGGG 

201 ATCGACTCCC GTAGAATATC AGCCGTTTAA ATACGACGTA TCTAAGAAGA 

251 CATTCACAAT CCTTTCCGTA GAAACGGCAA ATCAGAGCGG CTATGCTTAC 

301 GGAATCTCCT ACGATGGCAC GATCACTGTA GGAACGTGTA GCCTAGGTGC 

351 AGGAAAATAT AACGGCGCAA AATGGAGTGC GGATGGCACT TTAACACCCT 

401 TAACTGGAAT CACGGGGGGG ACGTCACATA CGGAAGCGCG TGCGATTTCT 

451 AAGGATACTC AGGTGATCGA GGGTTTCTCA TATGATGCTT CAGGGCAACC 

501 CAAGGCTGTG CAGTGGGCAA GCGGAGCGAC TACAGTAACA CAATTAGCAG 

551 ATATTTCAGG AGGCTCTAGA AGCTCTTATG CGTATGCTAT ATCTGATGAT 

601 GGCACGATTA TTGTTGGGTC TATGGAGAGC ACGATAACAA GGAAAACTAC 

651 AGCTGTAAAA TGGGTAAATA ATGTTCCTAC GTATCTGGGA ACCTTAGGAG 

701 GAGATGCTTC TACAGGTCTT TATATTTCTG GAGACGGCAC CGTGATTGTA 

751 GGTGCGGCAA ATACAGCAAC TGTAACCAAT GGGAATCAGG AATCCCACGC 

801 CTATATGTAT AAAGATAACC AAATGAAAGA TTGA 
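
The correspondence between the cp7105 nucleotide listing and the protein listing can be spot-checked by translation. The following is a minimal sketch, not part of the original disclosure: the helper names `clean_listing` and `translate` are hypothetical, the standard genetic code is assumed, and reading an initial GTG or TTG codon as methionine is an assumption based on bacterial start-codon usage.

```python
import re

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean_listing(text):
    """Drop position labels, spaces and line breaks from a printed listing."""
    return re.sub(r"[^ACGT]", "", text.upper())


def translate(dna, start_as_met=True):
    """Translate complete codons; trailing partial codons are ignored."""
    codons = [dna[i:i + 3] for i in range(0, len(dna) - 2, 3)]
    protein = "".join(CODON_TABLE[c] for c in codons)
    # Assumption: an initial ATG, GTG or TTG codon is read as methionine.
    if start_as_met and codons and codons[0] in {"ATG", "GTG", "TTG"}:
        protein = "M" + protein[1:]
    return protein


# First printed row of the cp7105 nucleotide sequence (SEQ ID 86).
first_row = "1 GTGAGTCTAT ATCAAAAATG GTGGAACAGT CAGTTAAAGA AGAGCCTCTG"
print(translate(clean_listing(first_row)))  # MSLYQKWWNSQLKKSL
```

The output agrees with the first sixteen residues of SEQ ID 85. Rows where the translation and the printed protein disagree are candidates for transcription errors in one of the two listings.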

The PSORT algorithm predicts an inner membrane location (0.100). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 43A.
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 43B) and for FACS analysis (Figure 43C). A his-tagged protein was also expressed. 

This protein also showed good cross-reactivity with human sera, including sera from patients with 
pneumonitis. 

These experiments show that cp7105 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 44 

The following C.pneumoniae protein (pid 4376802) was expressed <SEQ ID 87; cp6802>:



1 MSMQLQPCIS LGCVSYINSF PLSLQLIKRN DIRCVLAPPA DLLNLLIEGK

51 LDVALTSSLG AISHNLGYVP GFGIAANQRI LSVNLYAAPT FFNSPQPRIA

101 ATLESRSSIG LLKVLCRHLW RIPTPHILRF ITTKVLRQTP ENYDGLLLIG

151 DAALQHPVLP GFVTYDLASG WYDLTKLPFV FALLLHSTSW KEHPLPNLAM 

201 EEALQQFESS PEEVLKEAHQ HTGLPPSLLQ EYYALCQYRL GEEHYESFEK 

251 FREYYGTLYQ QARL* 

A predicted signal peptide is highlighted. 

The cp6802 nucleotide sequence <SEQ ID 88> is: 



1 ATGTCTAACC AACTCCAGCC ATGTATAAGC TTAGGCTGCG TAAGTTATAT

51 TAATTCCTTT CCGCTGTCCC TACAACTCAT AAAAAGAAAC GATATTCGCT

101 GTGTTCTTGC TCCCCCTGCA GACCTCCTCA ACTTGCTAAT CGAAGGGAAA

151 CTCGATGTTG CTTTGACCTC ATCCCTAGGA GCTATCTCTC ATAACTTGGG

201 GTATGTCCCC GGCTTTGGAA T0X3CAGCAAA CCAACGTATC CTCAGTGTAA

251 ACCTCTATGC AGCTCCCACT TTCTTTAACT CACCGCAACC TCGGATTGCC

301 GCAACTTTAG AAAGTCGCTC CTCTATAGGA CTCTTAAAAG TGCTTTGTCG

351 TCATCTCTGG CGCATCCCAA CTCCTCATAT CCTAAGATTC ATAACTACAA 

401 AAGTACTCAG ACAAACCCCT GAAAATTATG ATGGCCTCCT CCTAATCGGA 

451 GATGCAGCGC TACAACATCC TGTACTTCCT GGATTTGTAA CCTATGACCT 

501 TGCCTCGGGG TGGTATGATC TTACAAAGCT ACCTTTTGTA TTTGCTCTTC 

551 TTCTACACAG CACCTCTTGG AAAGAACATC CCCTACCCAA CCTTGCGATG 

601 GAAGAAGCCC TCCAACAGTT CGAATCTTCA CCCGAAGAAG TCCTTAAAGA 

651 AGCTCATCAA CATACAGGTC TGCCCCCTTC TCTTCTTCAA GAATACTATG 

701 CCCTATGCCA GTACCGTCTA GGAGAAGAAC ACTACGAAAG CTTTGAAAAA 

751 TTCCGGGAAT ATTATGGAAC CCTCTACCAA CAAGCCCGAC TGTAA



The PSORT algorithm predicts an inner membrane location (0.060). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 44A.
The recombinant protein was used to immunise mice, whose sera were used in a Western blot
(Figure 44B) and for FACS analysis (Figure 44C). A his-tagged protein was also expressed.

These experiments show that cp6802 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 45 

The following C.pneumoniae protein (pid 4376390) was expressed <SEQ ID 89; cp6390>:

1 MVFSYYCMGL FFFSG&ISSC GLLVSLGVQX, GI.gVLGVI,LI. LLAGLLLFKI

51 QSMLREVPKA PDliLDLEDAS ERLRVKASRS IiASLPKEISQ LESYIRSAAN 

101 DLOTIKTWPH KDQRfcVETVS RKLERLAAAQ NYMISELCEI SEILFJ3EEHH 

151 LILAQESLEW IGKSXjFSTFI, DMESFLNLSH IiSEVRPYLAV NDPRLLtEl TE 

201 ESWEWSHFI NVTSAFKKAQ ILFKKNEHSR MKKKLESVQE LIiETFIYKSL

251 KRSYRELGCL SEKMRIIHDN PLFPWVQDQQ KYAHAKNEFG EIARCLEEFE

301 KTFFWLDEEC AISYMDCWDF LNESIQNKKS RVDRDYISTK KIALKDRART 

351 YAKVL.LEENP TTEGKIDLQD AQRAFERQSQ EFYTLEHTET KVRLEALQQC 

401 FSDLREATNV RQVRFTNSEN AWDLKESFEK IDKERVRYQK EQRLYWETID

451 RNEQELREEI GESLRLQNRR KGYRAGYDAG RLKGLLRQWK KNLRDVEAHL 

501 EDATMDFEHE VSKSELCSVR ARLEVLEEEL MDMSPKVADI EELLSYEERC

551 ILP1RENLER AYLQYNKCSE ILSKAKFFFP EDEQLLVSEA NliREVGAQLK 

601 QVQGKCQERA QKFAIFEKHX QEQKSLXKEQ VRSFDLAGVG FLKSELLSIA 

651 CNLYIKAWK ESIPVDVPCM QLYYSYYEDN EAWRNRLIiN MTERYQNFKR 

701 SLNSIQFNGD VLLRDPVYQP EGHETRDKER ELQETTLSCK KIiKVAQDRXi S 

751 ELESRLSRR

A predicted signal peptide is highlighted. 

The cp6390 nucleotide sequence <SEQ ID 90> is:




1 


TTGGTATTCT 


51 


TTCTAGTTGT 


101 


TTTTAGGAGT 


151 


CAAAGTATGC 


201 


AGATGCAAGT 


251 


TCCCGAAGGA 


301 


GATCTAAATA 


351 


GACCGTGTCA 


401 


TTTCTGAACT 


451 


CTAATTTTGG 


501 


TACCTTTCTG 


551 


TGCGTCCGTA 


601 


GAATCTTGGG 


651 


GAAAGCTCAG 


701 


AGTTAGAAAG 


751 


AAGAGAAGTT 


801 


TCACGACAAT 


851 


ATGCTAAGAA 


901 


AAGACGTTCT 
951 TTGGGATTTT CTAAATGAGT CTATTCAGAA TAAGAAGTCC AGAGTAGATC

1001 GAGATTATAT ATCCACGAAG AAAATTGCAT TAAAGGATAG AGCCCGCACT 

1051 TATGCTAAGG TTCTTTTAGA AGAGAATCCG ACTACAGAGG GTAAAATAGA 

1101 TTTGCAAGAC GCTCAAAGAG CCTTTGAGCG TCAAAGTCAG GAGTTTTATA 

1151 CACTAGAGCA TACGGAAACA AAGGTGAGAC TAGAAGCACT TCAACAGTGC 

1201 TTCTCGGATC TTAGGGAGGC GACGAACGTA AGGCAAGTTA GGTTTACAAA
1251 TTCTGAAAAT GCGAATGATT TAAAGGAGAG TTTCGAGAAG ATAGATAAAG 

1301 AGCGTGTGCG ATATCAAAAA GAGCAAAGGC TCTATTGGGA AACAATAGAT
1351 CGCAATGAGC AAGAGCTTAG GGAAGAGATT GGGGAGTCGC TTCGTTTACA
1401 AAATCGGAGA AAAGGGTATA GGGCTGGATA TGATGCTGGG CGTTTAAAAG

1451 GTTTGTTGCG TCAGTGGAAG AAAAATCTCC GCGATGTGGA AGCCCACCTT

1501 GAAGATGCAA CTATGGATTT TGAGCATGAA GTAAGCAAGA GCGAATTGTG 

1551 CAGTGTTCGG GCGAGGCTCG AGGTTCTAGA AGAAGAGCTG ATGGATATGT 

1601 CTCCTAAAGT TGCGGATATA GAAGAGTTGT TGTCCTATGA AGAGCGTTGT

1651 ATTCTTCCTA TTAGGGAAAA TTTAGAAAGG GCATACCTCC AATATAATAA

1701 GTGTTCTGAA ATTTTATCCA AGGCAAAGTT CTTCTTTCCG GAAGACGAGC 

1751 AATTGCTAGT TTCGGAAGCG AATCTAAGAG AGGTGGGTGC CCAGTTAAAA 

1801 CAAGTACAGG GAAAATGTCA AGAGAGGGCC CAAAAGTTCG CAATATTTGA 

1851 AAAGCATATT CAGGAGCAGA AAAGCCTTAT TAAAGAGCAA GTGCGGAGTT 

1901 TTGATCTAGC GGGAGTTGGG TTTTTAAAGA GTGAGCTTCT TAGTATTGCT

1951 TGTAACCTTT ATATAAAGGC GGTTGTTAAG GAGTCTATAC CAGTTGATGT 

2001 GCCTTGTATG CAGTTATATT ATAGTTATTA CGAAGATAAT GAAGCTGTAG 

2051 TGCGAAACCG CCTTTTAAAT ATGACGGAGA GGTATCAAAA TTTTAAAAGG 

2101 AGTTTGAATT CCATACAATT TAATGGTGAC GTTCTTTTAC GGGATCCGGT 

2151 CTATCAACCT GAAGGTCATG AGACCAGGCT AAAGGAACGG GAGCTACAAG

2201 AAACAACTTT GTCTTGTAAG AAATTAAAAG TGGCTCAAGA TCGTCTTTCT 

2251 GAATTAGAGT CAAGGCTGTC TAGGAGATAG

The PSORT algorithm predicts a periplasmic location (0.932). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 45A.
The recombinant protein was used to immunise mice, whose sera were used in a Western blot
(Figure 45B) and for FACS analysis (Figure 45C). A his-tagged protein was also expressed. 

This protein also showed good cross-reactivity with human sera, including sera from patients with 
pneumonitis. 

These experiments show that cp6390 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone.

Example 46

The following C.pneumoniae protein (pid 4376272) was expressed <SEQ ID 91; cp6272>: 

1 MKRCFLFIAS FVLMGSSADA LTHQEAVKKK NSYLSHFKSV SGIVTIEDGV

51 IiNlHNNLRIQ ANKVYVENTV GQSLKLVAHG NVMVNYRAKT LVCDYIiEYYE 

101 DTDSCLLTNG RFAMYPWFLG GSMITLTPET IVIKKGYIST SEGPKKDLCL

151 SGDYLEYSSD SLI>SIGKTTL RVCRIPILFL PPFSIMPMEI PKPPINFRGG 

201 TGGFLGSY1.G MSYSPISRKH FSSTFFLDSF FKHGVGMGFN LHCSQKQVPE 

251 NVFNMKSYYA HRIxAIDMAEA HDRYRLHGDF CFTHKHVNFS GEYHLSDSWE 

301 TVADIFPNWF MUCNTGPTRV DCTWNDNYFE GYLTSSVKVN SFQNANQELP 

351 YLTLRQYPIS IYNTGVYLEN IVECGYIaNFA FSDHIVGENF SSLRLAARPK

401 LHKTVPLPIG TLSSTLGSSI, 1YYSDVPEIS SRHSQLSAKL QLDYRFLLHK 

451 SYIQRRHIIE PFVTFITETR PLAKMEDHYI FSIQDAFHSL NLLKAG1DTS 

501 VLSKTNPRFP RIHAKLWTTH ILSNTESKPT FPKTACELSL PFGKKNTVSL 

551 DAEWIWKKHC WDHMNIRWEW IGNOTVAMTL ESLHRSKYSL IKCDRENFIL 

601 DVSRPIDQLL DSPLSDHRNL, ILGKLFVRPH PCWNYRI-SLR YGWHRQDTPN

651 YLEYQMILGT KIFEHWQLYG VYERREADSR FFFFLKIjDKP KKPPF* 

A predicted signal peptide is highlighted. 

The cp6272 nucleotide sequence <SEQ ID 92> is: 

1 ATGAAACGTT GCTTCTTATT TCTAGCTTCC TTTGTTCTTA TGGGTTCCTC



51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 
651 
701 
751 
801 
851 
901 
951 
1001 
1051 
1101 
1151 
1201 
1251 
1301 
1351 
1401 
1451 
1501 
1551 
1601 
1651 
1701 
1751 
1801 
1851 
1901 
1951 
2001 
2051 



AGCTGATGCT 
TTAGTCACTT 
TTGAATATCC 
AAATACTGTG 
TGAACTATAG 
GATACAGACT 
GTTTCTAGGG 
GGAAGGGATA 
TCCGGAGATT 
GACAACATTA 
CTATCATGCC 
ACAGGAGGAT 
TAGGAAGCAT 
GCGTCGGCAT 
AATGTCTTCA 
GGCAGAAGCT 
ATAAGCATGT 
ACTGTTGCTG 
CACACGTGTC 
CCTCTTCTGT 
TATTTAACAT 
CCTTGAAAAC 
ATATCGTTGG 
CTCCATAAAA 
GAGTTCTCTG 
GTCAGCTTTC 
TCCTACATTC 
AGAGACTCGT 
AAGATGCCTT 
GTACTGAGTA 
GACTACCCAC 
CTGCATGCGA 
GATGCTGAAT 
TTGGGAGTGG 
ATAGAAGCAA 
GATGTCAGCC 
TAGGAATCTC 
ATTACCGCTT 
TACCTAGAAT 
GCTCTATGGG 
TCTTAAAGCT 



TTGACTCATC 
TAAGAGTGTT 
ATAACAACCT 
GGTCAAAGCC 
GGCAAAAACC 
CTTGTCTTCT 
GGGTCTATGA 
TATCTCTACC 
ACCTGGAATA 
AGGGTGTGTC 
TATGGAGATC 
TTCTGGGATC 
TTCTCCTCGA 
GGGATTCAAC 
ATATGAAAAG 
CATGATCGCT 
AAATTTTTCT 
ACATTTTCCC 
GATTGCACTT 
TAAGGTAAAC 
TAAGGCAGTA 
ATCGTAGAAT 
CGAGAATTTC 
CTGTGCCTCT 
ATTTACTATA 
CGCGAAGCTA 
AAAGACGCCA 
CCTCTAGCTA 
TCACTCCTTA 
AGACTAACCC 
ATCTTGAGCA 
GCTATCTCTA 
GGATTTGGAA 
ATCGGAAATG 
ATACAGCCTG 
GTCCCATOGA 
ATTTTAGGGA 
ATCCTTACGC 
ACCAGATGAT 
GTGTATGAAC 
CGACAAACCT 



AAGAGGCTGT 
TCTGGGATTG 
GCGGATACAA 
TGAAGCTTGT 
CTAGTTTGTG 
TACTAATGGA 
TCACTCTAAC 
TCCGAGGGTC 
TTCTTCAGAT 
GCATTCCGAT 
CCTAAGCCTC 
CTATTTGGGG 
CATTTTTCTT 
CTCCATTGTT 
CTATTATGCC 
ATCGCCTACA 
GGAGAATACC 
CAACAACTTC 
GGAATGACAA 
TCTTTCCAAA 
CCCGATTTCT 
GTGGGTATTT 
TCTTCACTAC 
ACCTATAGGA 
GCGATGTTCC 
CAACTTGACT 
TATTATAGAG 
AGAATGAAGA 
AACCTTCTGA 
TCGATTCCCG 
ATACAGAAAG 
CCTTTTGGAA 
AAAGCACTGT 
ACAATGTGGC 
ATTAAGTGTG 
CCAGCTTTTA 
AATTATTTGT 
TATGGCTGGC 
TCTAGGGACG 
GCCGAGAAGC 
AAAAAACCTC 



GAAAAAGAAA 
TGACCATCGA 
GCCAATAAAG 
CGCACATGGC 
ATTACCTAGA 
AGATTCGCGA 
CCCAGAAACC 
CCAAAAAAGA 
AGTCTTCTTT 
ACTTTTCTTA 
CGATAAACTT 
ATGAGCTACT 
GGATAGCTTT 
CTCAGAAGCA 
CACCGCCTTG 
CGGAGATOTC 
ATCTCAGCGA 
ATGTTGAAAA 
CTATTTTGAA 
ATGCCAACCA 
ATTTATAATA 
AAACTTTGCT 
GTCTTGCTGC 
ACGCTCTCCT 
TGAGATCTCC 
ATCGCTTTCT 
CCGTTCGTTA 
TCATTATATC 
AAGCGGGTAT 
AGAATCCATG 
CAAACCCACG 
AGAAAAATAC 
TGGGATCACA 
TATGACTCTA 
ACAGGGAGAA 
GACTCCCCTC 
ACGACCTCAT 
ATCGCCAGGA 
AAGATCTTCG 
AGATAGTCGA 
CCTTCTAA 



AACTCCTATC 
AGATGGGGTA 
TGTATGTAGA 
AATGTTATGG 
GTATTACGAA 
TGTATCCTTG 
ATAGTCATTC 
CCTGTGCCTC 
CTATAGGGAA 
CCTCCATTTT 
TCGAGGAGGA 
CGCCGATTTC 
TTCAAGCATG 
GGOTCCTGAG 
CTATCGATAT 
TGCTTCACGC 
TAGTTGGGAA 
ATACAGGCCC 
GGGTATCTCA 
AGAGCTCCCT 
CGGGAGTGTA 
TTTAGCGATC 
GCGCCCTAAG 
CCACCCTAGG 
TCGCGCCATA 
ATTACATAAG 
CCTTCATTAC 
TTTTCTATTC 
AGATACCTCG 
CGAAGCTGTG 
TTTCCCAAAA 
AGTCTCCTTA 
TGAACATACG 
GAATCCCTGC 
CTTCATTTTA 
TCTCTGATCA 
CCCTGTTGGA 
CACTCCGAAC 
AACATTGGCA 
TTTTTCTTCT 



The PSORT algorithm predicts an outer membrane location (0.48). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 46A.
The recombinant protein was used to immunise mice, whose sera were used in a Western blot and for 
FACS analysis (Figure 46B). A his-tagged protein was also expressed. 

This protein also showed good cross-reactivity with human sera, including sera from patients with 
pneumonitis. 

These experiments show that cp6272 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.

Example 47

The following C.pneumoniae protein (pid 4377111) was expressed <SEQ ID 93; cp7111>:



1 MFEAVIADIQ AREILDSRGY PTLHVKVTTS TGSVGEARVP SGASTGKKEA

51 LEFRDTDSPR YQGKGVLQAV KNVKEILFPL VKGCSVYEQS LIDSLMMDSD

101 GSPNKETLGA HAILGVSLAT AHAAAATLRR PLYRYLGGCF ACSLPCPMMN

151 LINGGMHADN GLEFQEFMIR PIGASSIKEA VNMGADVFHT LKKLLHERGL

201 STGVGDEGGF APNLASNEEA LELLLLAIEK AGFTPGKDIS LALDCAASSF






251 YNVKTGTYDG RHYEEQIAIL SNLCDRYPID SIEDGLAEED YDGWALLTEV 

301 LGEKVQIVGD DLFVTNPELI LEGISNGLAN SVLIKPNQIG TLTETVYAIK

351 LAQMAGYTTI ISHRSGETTD TTIADLAVAF NAGQIKTGSL SRSERVAKYN

401 RLMEIEEELG SEAIFTDSNV FSYEDSEE*

A predicted signal peptide is highlighted. 

The cp7111 nucleotide sequence <SEQ ID 94> is:

1 ATGTTTGAAG CTGTCATTGC CGATATCCAG GCTAGGGAAA TCTTCGATTC 

51 TCGCGGGTAT CCCACTTTAC ATGTTAAAGT AACCACTAGC ACAGGTTCTG 

101 TTGGAGAAGC TCGGGTTCCT TCAGGAGCAT CCACAGGGAA AAAAGAAGCC 

151 TTAGAGTTTC GTGATACAGA TTCTCCTCGT TATCAAGGCA AAGGGGTTTT 

201 GCAAGCTGTA AAAAACGTAA AAGAAATTCT TTTTCCCCTC GTCAAGGGAT
251 GTAGTGTTTA TGAGCAATCC TTAATTGATT CTCTGATGAT GGATTCTGAC

301 GGCTCTCCGA ACAAAGAAAC TCTAGGGGCC AATGCTATTT TAGGAGTCTC
351 TCTAGCTACA GCACATGCAG CAGCAGCAAC ACTACGCAGA CCTCTGTATC 
401 GTTATTTAGG AGGGTGTTTT GCCTGCAGTC TTCCCTGTCC TATGATGAAT 
451 CTGATCAATG GAGGCATGCA TGCCGATAAC GGCTTGGAGT TCCAAGAATT 
501 TATGATCCGT CCTATTGGAG CCTCTTCCAT CAAAGAAGCT GTCAACATGG 
551 GTGCTGACGT TTTTCATACT TTGAAAAAAT TACTCCATGA AAGAGGCTTA 
601 TCTACTGGAG TGGGTGACGA AGGAGGCTTC GCCCCGAATC TTGCTTCTAA 
651 TGAAGAAGCT CTAGAGCTCC TATTGCTGGC TATTGAAAAA GCAGGCTTTA 
701 CTCCAGGAAA AGATATATCG CTAGCCTTAG ACTGCGCAGC ATCCTCATTC 
751 TATAACGTAA AAACAGGCAC GTATGATGGG AGGCACTATG AAGAGCAAAT 
801 CGCAATCCTT TCTAATOTAT GTGATCGCTA TCCTATAGAC TCCATAGAAG 
851 ATGGTCTTGC TGAAGAAGAC TATGACGGGT GGGCCTTGST AACTGAAGTT 
901 CTTGGAGAAA AAGTACAGAT TGTGGGTGAT GACCTATTTG TTACAAATCC 
951 GGAATTAATA TTAGAGGGTA TTAGCAATGG ATTAGCGAAC TCTGTGTTGA 

1001 TTAAACCAAA TCAGATAGGG ACGCTTACTG AAACAGTGTA TGCTATCAAG 

1051 CTTGCGCAAA TGGCTGGCTA TACTACAATT ATTTCTCATC GCTCAGGAGA 

1101 AACTACGGAC ACTACGATTG CAGATCTTGC TGTTGCCTTC AACGCCGGTC 

1151 AAATCAAAAC AGGCTCTTTA TCACGTTCTG AGCGTGTTGC AAAATACAAT 

1201 AGACTCATGG AAATTGAAGA AGAGCTTGGA TCCGAAGCAA TTTTCACAGA 

1251 TTCTAATGTA TTTTCTTACG AGGATTCTGA GGAATAG

The PSORT algorithm predicts an inner membrane location (0.100). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 47A.
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 47B) and for FACS analysis (Figure 47C). A his-tagged protein was also expressed. 

The cp7111 protein was also identified in the 2D-PAGE experiment and showed good 
cross-reactivity with human sera, including sera from patients with pneumonitis. 

These experiments show that cp7111 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 48 



The following C.pneumoniae protein (pid 4455886) was expressed <SEQ ID 95; cp0010>: 

1 MKSQFSWLVL SSTIACFTSC STVFAATAEN IGPSDSFDGS TNTGTYTPKN

51 TTTGIDYTLT GDITLQNLGD SAALTKGCFS DTTESLSFAG KGYSLSFLNI 

101 KSSAEGAALS VTODKNLSLT GFSSLTFLAA PSSVITTPSG KGAVKCGGDL 

151 TFDWWGTILF KQDYCEENGG AISTKNLSLK NSTGSISFEG NKSSATGKKG 

201 GAICATGTVD ITNNTAPTLF SNNIAEAAGG AINSTGWCTI TONTSLVFSE 

251 NSVTATAGNG GALSGDADVT ISGNQSVTFS GNQAVANGGA XYAKKLTLAS 

301 GGGGVSPFLT IIVQGTTAGN GGAISILAAG ECSLSAEAGD ITFNGNAIVA

351 TTPQTTKRNS IDIGSTAKIT NLRAISGHSI FFYDFITAOT AADSTDTLNL 

401 WKADAGNSTD YSGSIVFSGE KIiSEDEAKVA DNLTSTLKQP VTXiTAGNLVX, 

451 KRGVTLDTKG FTQTAGSSVI MDAGTTZjKAS TEEVTLTGLS 1PVDSLGEGK 

501 KWIAASAAS KNVALSGPIL LLDNQGNAYE NHDLGKTQDF SFVQLSALGT 
551 ATTTDVPAVP TVATPTHYGY QGTWGMTWVD DTASTPKTKT ATLAWTNTGY

601 LPNPERQGPL VPNSLWGSFS DIQAIQGVIE RSALTLCSDR GFWAAGVANF

651 LDKDKKGEKR KYRHKSGGYA IGGAAQTCSE NLISFAFCQL FGSDKDFLVA

701 KNHTDTYAGA FYIQHITECS GFIGCLLDKL PGSWSHKPLV LEGQLAYSHV

751 SNDLKTKYTA YPEVKGSWGN NAFNMMLGAS SHSYPEYLHC FDTYAPYIKL

801 NLTYIRQDSF SEKGTEGRSF DDSNLFNLSL PIGVKFEKFS DCNDFSYDLT

851 LSYVPDLIRN DPKCTTALVI SGASWETYAH NLARQALQVR AGSHYAFSPM

901 FEVLGQFVFE VRGSSRIYNV DLGGKFQF*





A predicted signal peptide is highlighted. 
The cp0010 nucleotide sequence <SEQ ID 96> is:



1

51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 
651 
701 
751 
801 
851 
901 
951 
1001 
1051 
1101 
1151 
1201 
1251 
1301 
1351 
1401 
1451 
1501 
1551 
1601 
1651 
1701 
1751 
1801 
1851 
1901 
1951 
2001 
2051 
2101 
2151 
2201 
2251 
2301 
2351 
2401 
2451 
2501 
2551 
2601 
2651 
2701 



ATGAAATCGC AATTTTCCTG 
TACTAGTTGT TCCACTGTTT 
CTGATAGCTT TGACGGAAGT 
ACGACTACTG GAATAGACTA 
CCTTGGGGAT TCGGCAGCTT 
AATCTTTAAG CTTTGCCGGT 
AAGTCTAGTG CTCAAGGCGC 
GTCGCTAACA GGATTTTCGA 
TAATCACAAC CCCCTCAGGA 
ACATTTGATA ACAATGGAAC 
AAATGGCGGA GCCATTTCTA 
GATCGATTTC TTTTGAAGGG 
GGGGCTATTT GTGCTACTGG 
TACCCTCTTC TCGAACAATA 
GCACAGGAAA CTGTACAATT 
AATAGTGTGA CAGCGACCGC 
CGATGTTACC ATATCTGGGA 
CTGTAGCTAA TGGCGGAGCC 
GGGGGGGGGG GGGTATCTCC 
TGCAGGTAAT GGTGGAGCCA 
TTTCAGCAGA AGCAGGGGAC 
ACTACACCAC AAACTACAAA 
AAAGATCACG AATTTACGTG 
ATC CGATTAC TGCTAATACG 
AATAAGGCTG ATGCAGGTAA 
TTCTGGTGAA AAGCTCTCTG 
CTTCTACGCT GAAGCAGCCT 
AAACGTGGTG TCACTCTCGA 
CTCTGTTATT ATGGATGCGG 
TCACTTTAAC AGGTCTTTCC 
AAAGTTGTAA TTGCTGCTTC 
TCCGATTCTT CTTTTGGATA 
TAGGAAAAAC TCAAGACTTT 
GCAACAACTA CAGATGTTCC 
CTATGGGTAT CAAGGTACTT 
GCACTCCAAA GACTAAGACA 
CTTCCGAATC CTGAGCGTCA 
ATCTTTTTCA GACATCCAAG 
TGACTCTTTG TTCAGATCGA 
TTAGATAAAG ATAAGAAAGG 
TGGATATGCT ATCGGAGGTG 
GCTTTGCCTT TTGCCAACTC 
AAAAATCATA CTGATAC CTA 
AGAATGTAGT GGGTTCATAG 
GGAGTCATAA ACCCCTCGTT 
AGTAATGATC TGAAGACAAA 
TTGGGGGAAT AATGCTTTTA 
ATCCTGAATA CCTGCATTGT 
AATCTGACCT ATATACGTCA 
AAGATCTTTT GATGACAGCA 
TGAAGTTTGA GAAGTTCTCT 
TTATCCTATG TTCCTGATCT 
ACTTGTAATC AGCGGAGCCT 
GACAGGCCTT GCAAGTGCGT 
TTTGAAGTGC TCGGCCAGTT 



GTTAGTGCTC 
TTGCTGCAAC 
ACTAACACAG 
TACTCTGACA 
T AAC G AAGGG 
AAGGGGTACT 
AGCACTTTCT 
GTCTTACTTT 
AAAGGTGCAG 
TATTTTATTT 
CCAAGAATCT 
AATAAATCGA 
TACTGTAGAT 
TTGCTGAAGC 
ACAGGGAATA 
AGGAAATGGA 
ATCAGAGTGT 
ATTTATGCTA 
TTTTCTAACA 
TTTCTATACT 
ATTACCTTCA 
AAGAAATTCT 
CAATATCTGG 
GCTGCGGATT 
TAGTACAGAT 
AAGATGAAGC 
GTAACTCTAA 
TACGAAAGGC 
GCACAACGTT 
ATTCCTGTAG 
TGCAGCAAGT 
ACCAAGGGAA 
TCATTTGTGC 
AGCGGTTCCT 
GGGGAATGAC 
GCGACATTAG 
AGGAC CTTTA 
CGATTCAAGG 
GGCTTCTGGG 
GGAAAAACGC 
CAGCGCAAAC 
TTTGGTAGCG 
TGCAGGAGCC 
GTTGTCTCTT 
TTAGAAGGGC 
GTATACTGCG 
ACATGATGTT 
TTTGATACCT 
GGACAGCTTC 
ACCTCTTCAA 
GATTGTAATG 
TATCCGCAAT 
CTTGGGAAAC 
GCAGGCAGTC 
TGTCTTTGAA 



TCTTCGACAT TGGCATGTTT 
TGCTGAAAAT ATAGGCCCCT 
GCACCTATAC TCCTAAAAAT 
GGAGATATAA CTCTGCAAAA 
TTGTTTTTCT GACACTACGG 
CACTTTCTTT TTTAAATATT 
GTTACAACTG ATAAAAATCT 
CTTAGCGGCC CCATCATCGG 
TTAAATGTGG AGGGGATCTT 
AAACAAGATT ACTGTGAGGA 
TTCTTTGAAA AACAGCACGG 
GCGCAACAGG GAAAAAAGGT 
ATTACAAATA ATACGGCTCC 
TGCAGGTGGA GCTATAAATA 
CGTCTCTTGT ATTTTCTGAA 
GGAGCTCTTT CTGGAGATGC 
AACTTTCTCA GGAAACCAAG 
AGAAGCTTAC ACTGGCTTCC 
ATAaTAGTCC AAGGTACCAC 
GGCAGCTGGA GAGTGTAGTC 
ATGGGAATGC CATTGTTGCA 
ATTGACATAG GATCTACTGC 
GCATAGCATC TTTTTCTACG 
CTACAGATAC TTTAAATCTC 
TATAGTGGGT CGATTGTTTT 
AAAAGTTGCA GACAACCTCA 
CTGCAGGAAA TTTAGTACTT 
TTTACTCAGA CCGCGGGTTC 
AAAAGCAAGT ACAGAGGAGG 
ACTCTTTAGG CGAGGGTAAG 
AAAAATGTAG CCCTTAGTGG 
TGCTTATGAA AATCACGACT 
AGCTCTCTGC TCTGGGTACT 
ACAGTAGCAA CTCCTACGCA 
TTGGGTTGAT GATACCGCAA 
CTTGGACCAA TACAGGCTAC 
GTTCCTAATA GCCTTTGGGG 
TGTCATAGAG AGAAGTGCTT 
CTGCGGGAGT CGCCAATTTC 
AAATACCGTC ATAAATCTGG 
TTGTTCTGAA AACTTAATTA 
ATAAAGATTT CTTAGTCGCT 
TTCTATATCC AACACATTAC 
AGATAAACTT CCTGGCTCTT 
AGCTCGCTTA TAGCCACGTC 
TATCCTGAGG TGAAAGGTTC 
GGGAGCTTCT TCTCATTCTT 
ATGCTC CATA CATCAAACTG 
TCGGAGAAAG GTACAGAAGG 
TTTATCTTTG CCTATAGGGG 
ACTTTTCTTA TGATCTGACS* 
GATCCCAAAT GCACTACAGC 
TTATGCCAAT AAC^TAGCAC 
ACTACGCCTT CTCTCCTATG 
GTTCGTGGAT CCTCACGGAT 
2751 TTATAATGTA GATCTTGGGG GTAAGTTCCA ATTCTAG

The PSORT algorithm predicts an outer membrane location (0.922). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 48A.
The recombinant protein was used to immunise mice, whose sera were used in a Western blot
(Figure 48B) and for FACS analysis (Figure 48C). A his-tagged protein was also expressed.

The cp0010 protein was also identified in the 2D-PAGE experiment and showed good
cross-reactivity with human sera, including sera from patients with pneumonitis.

These experiments show that cp0010 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 49

The following C.pneumoniae protein (pid 4376296) was expressed <SEQ ID 97; cp6296>:

1 MEEVSEYLQQ VENQLESCSK RLTKMETFAL GVRLEAKEEI ESIILSDVVN

51 RFEVLCRDIE DMLSRVEEIE KMLRMAELPL LPIKEALTKA FVQHNSCKEK

101 LTKVEPYFKE SPAYLTSEER LQSLNQTLQR AYKESQKVSG LESEVRACRE

151 QLKDQVRQPE TQGVSLIKEE ILFVTSTFRT KFSYHSFRLH VPCMRLYEEY

201 YDDIDLERTR ARWMAMSERY RDAFQAFQEM LKEGLVEEAQ ALRETEYWLY

251 REERKSKKKH*

The cp6296 nucleotide sequence <SEQ ID 98> is: 

1 ATGGAGGAGG TGTCTGAGTA TCTTCAGCAA GTAGAAAATC AGTTGGAATC

51 CTGTTCCAAG CGATTAACCA AGATGGAAAC TTTTGCCTTA GGTGTGAGGT

101 TGGAAGCTAA AGAAGAGATA GAGTCTATCA TACTTTCTGA TGTAGTGAAC

151 CGTTTTGAGG TTTTATGTAG AGATATTGAA GATATGCTAT CTCGAGTCGA

201 GGAGATAGAG CGGATGTTAC GTATGGCGGA GCTTCCTCTA CTTCCTATAA

251 AAGAAGCGCT TACCAAGGCT TTTGTACAAC ATAACAGCTG TAAAGAGAAG

301 TTAACCAAGG TAGAGCCTTA CTTTAAAGAG AGCCCTGCAT ATCTAACTAG

351 TGAAGAGCGA TTGCAGAGTT TGAATCAGAC TTTACAACGT GCGTACAAAG

401 AGTCCCAAAA GGTTTCAGGT TTAGAATCGG AAGTGAGAGC CTGTCGAGAG

451 CAGCTTAAAG ATCAAGTAAG ACAGTTTGAA ACTCAAGGAG TGAGCTTCAT

501 AAAAGAAGAG ATTCTCTTTG TGACTAGTAC CTTTAGAACT AAATTTAGCT

551 ATCATTCATT TCGATTACAT GTTCCTTGCA TGAGGTTGTA TGAGGAGTAT

601 TATGATGACA TTGATCTAGA GAGAACTCGA GCTCGATGGA TGGCGATGTC 

651 TGAGAGGTAT AGAGATGCTT TTCAGGCATT CCAGGAGATG TTGAAGGAAG 

701 GCCTAGTTGA AGAAGCTCAG GCTCTTAGAG AAACCGAGTA CTGGTTATAT 

751 CGAGAGGAGA GAAAGAGTAA AAAGAAACAT TGA 
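
A quick consistency check on paired listings such as SEQ ID 97 and SEQ ID 98 is to compare their lengths, which can be read off the final printed row of each. The sketch below is illustrative only: the function names are hypothetical, and it assumes that each position label gives the 1-based index of the first character in its row and that the open reading frame ends in a single stop codon.

```python
def listing_length(last_label, last_row):
    """Total length implied by the final row's position label and content."""
    return last_label - 1 + sum(ch.isalpha() for ch in last_row)


def expected_residues(nucleotide_length):
    """Residues encoded by an ORF of this length, excluding the stop codon."""
    if nucleotide_length % 3:
        raise ValueError("ORF length is not a multiple of three")
    return nucleotide_length // 3 - 1


# Final rows of the cp6296 nucleotide and protein listings as printed above.
nt_len = listing_length(751, "CGAGAGGAGA GAAAGAGTAA AAAGAAACAT TGA")
aa_len = listing_length(251, "REERKSKKKH")
print(nt_len, expected_residues(nt_len), aa_len)  # 783 260 260
```

Here the 783-nucleotide listing encodes 260 residues plus a stop codon, matching the 260 residues of the printed protein. A mismatch would point to a dropped or duplicated row in one of the listings.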

The PSORT algorithm predicts a cytoplasmic location (0.523).

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 49A.
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 49B) and for FACS analysis (Figure 49C). A his-tagged protein was also expressed. 

These experiments show that cp6296 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone.

Example 50 

The following C.pneumoniae protein (pid 4376664) was expressed <SEQ ID 99; cp6664>: 

1 MVLFHAQASG RNRVKADAIV LPFWHPKDAK NAASFEAEFE PSYLPALENF

51 QGKTGEIELL YSSPKAKEKR IVLLGLGKNE ELTSDVVFQT YATLTRVLRK

101 AKCSTVNIIL PTISELRLSA EEFLVGLSSG ILSLNYDYPR YNKVDRNLET

151 PLSKVTVIGI VPKMADAIFR KEAAIFEGVY LTODLVNRNA DEITPKKLAE

201 VALNLGKEFP SIDTKVLGKD AIAKEKMGLL LAVSKGSCVD PHFIVVRYQG

251 RPKSKDHTVL IGKGVTFDSG GLDLKPGKSM LTMKEDMAGG ATVLGILSAL

301 AVLELPINVT GIIPATENAI DGASYKMGDV YVGMSGLSVE ICSTDAEGRL

351 ILADAITYAL KYCKPTRIID FATLTGAMVV SLGEEVAGFF SNNDVLAEDL

401 LEASAETSEP LWRLPLVKKY DKTLHSDIAD MKNLGSNRAG AITAALFLQR

451 FLEESSVAWA HLDIAGTAYH EKEEDRYPKY ASGFGVRSIL YYLENSLSK*

The cp6664 nucleotide sequence <SEQ ID 100> is: 

1 GTGGTTTTAT TTCATGCTCA AGCCTCTGGG CGTAATCGTG TTAAGGCAGA 

51 TGCTATAGTC CTGCCCTTTT GGCATTTTAA GGATGCAAAA AATGCAGCTT 

101 CTTTTGAAGC CGAGTTTGAA CCCTCGTATC TCCCCGCTTT AGAAAACTTT 

151 CAAGGAAAAA CCGGGGAGAT TGAACTCCTT TATAGTAGTC CTAAAGCTAA 

201 GGAAAAACGC ATTGTCCTCT TAGGCTTAGG GAAAAATGAA GAGCTCACCT 

251 CTGATGTTGT TTTCCAAACC TATGCGACAC TAACTCGTGT CTTACGTAAA 

301 GCAAAGTGTT CCACAGTCAA TATCATCTTA CCTACAATTT CTGAATTGCG

351 GCTTTCTGCC GAAGAATTCT TAGTGGGGTT GTCCTCAGGA ATTTTGTCAT 

401 TAAACTATGA CTACCCACGT TATAATAAGG TAGATCGTAA TCTTGAAACT 

451 CCTCTTTCTA AAGTCACGGT TATCGGTATC GTTCCCAAAA TGGCGGATGC 

501 TATCTTTAGG AAAGAAGCAG CCATTTTCGA AGGCGTATAT CTCACTCGAG 

551 ATCTTGTGAA CAGGAATGCT GATGAAATTA CCCCTAAGAA ATTGGCAGAG 

601 GTTGCTCTGA ATCTGGGAAA AGAGTTCCCT AGTATTGATA CTAAGGTCTT 

651 GGGAAAAGAT GCCATCGCCA AAGAGAAAACT GGGACTCCTA TTGGCTGTTT 

701 CCAAGGGTTC TTGTGTGGAT CCACACTTTA TCGTTGTCCG TTATCAAGGA 

751 CGTCCTAAGT CTAAAGATCA CACCGTCTTG ATAGGGAAAG GGGTCACTTT 

801 TGACTCTGGA GGTTTAGACC TCAAGCCTGG AAAATCCATG CTTACTATGA 

851 AAGAAGACAT GGCAGGTGGG GCTACAGTCC TCGGGATTCT CTCGGCGTTA 

901 GCAGTTTTAG AGCTTCCTAT AAATCTCACG GGGATCATTC CTGCTACAGA 

951 GAATGCTATC GATGGCGCCT CCTATAAAAT GGGAGATGTC TATGTAGGAA 

1001 TGTCGGGGCT TTCTGTTGAG ATTTGTAGTA CCGATGCTGA GGGACGTCTT 

1051 ATCCTCGCTG ATGCGATTAC ATATGCTTTA AAATATTGTA AACCGACACG 

1101 TATTATAGAT TTTGCAACTC TAACAGGAGC TATGGTAGTC TCTCTAGGAG 

1151 AAGAGGTTGC AGGTTTCTTT TCCAATAACG ATGTTTTAGC TGAAGATCTT 

1201 TTAGAGGCGT CAGCCGAAAC CTCCGAGCCG TTATGGAGAC TTCCTCTAGT

1251 TAAGAAGTAT GATAAAACAT TGCATTCTGA TATTGCTGAT ATGAAAAATC 

1301 TAGGCAGTAA CCGTGCAGGG GCTATTACAG CAGCATTATT CTTGCAGAGA 

1351 TTTTTGGAAG AATCTTCGGT AGCTTGGGCA CATCTTGATA TTGCAGGTAC 

1401 TGCATATCAT GAAAAAGAAG AAGACCGTTA TCCAAAATAT GCTTCAGGTT 

1451 TTGGTGTTCG TTCTATTCTT TATTACTTAG AAAATAGTCT TTCTAAGTAG 

The PSORT algorithm predicts an inner membrane location (0.268). 
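
The relationship between a nucleotide listing and its protein listing can be checked mechanically. The following is a minimal sketch, not part of the original disclosure, assuming the standard genetic code and assuming that an initial GTG or TTG codon is read as methionine (as is usual for bacterial start codons). The names `CODON_TABLE`, `START_CODONS` and `translate` are illustrative only.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Assumption: these codons are read as Met only when they open the sequence.
START_CODONS = {"ATG", "GTG", "TTG"}


def translate(dna: str) -> str:
    """Translate a coding sequence; whitespace is ignored, unknown codons give 'X'."""
    dna = "".join(dna.split()).upper()
    protein = []
    # Trailing bases that do not fill a complete codon are dropped.
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        codon = dna[pos:pos + 3]
        if pos == 0 and codon in START_CODONS:
            protein.append("M")
        else:
            protein.append(CODON_TABLE.get(codon, "X"))
    return "".join(protein)


# First 30 nucleotides of the cp6664 listing, as printed above.
print(translate("GTGGTTTTAT TTCATGCTCA AGCCTCTGGG"))  # MVLFHAQASG
```

Comparing the output of such a function with the printed protein rows is one way to locate transcription errors in either listing; a mismatch shows only that the two listings disagree, not which of them is wrong.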

The protein was expressed in E.coli and purified as a GST-fusion (Figure 50A), as a his-tagged
protein, and as a GST/His fusion. The proteins were used to immunise mice, whose sera were used in
Western blot (Figure 50B) and FACS (Figure 50C) analyses.

The cp6664 protein was also identified in the 2D-PAGE experiment (Cpn0385) and showed good 
cross-reactivity with human sera, including sera from patients with pneumonitis. 

These experiments show that cp6664 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 51 



The following C.pneumoniae protein (pid 4376696) was expressed <SEQ ID 101; cp6696>:

1 MTLIFVIIIV WCNAFLIKLC VIMGLQSRLQ HCIEVSQNSN FDSQVKQFIY

51 ACQDKTLRQS VLKIFRYHFL LKIHDIARAV YLLMALEEGE DLGLSFLNVQ

101 QYPSGAVELF SCGGFPWKGL PYPAEHAEFG LLLLQIAEFY EESQAYVSKM

151 SHFQQALFDH QGSVFPSLWS QENSRLLKEK TTLSQSFLFQ LGMQIHPEYS

201 LEDPALGFWM QRTRSSSAFV AASGCQSSLG AYSSGDVGVI AYGPCSGDIS

251 DCYYFGCCGI AKEFVCQKSH QTTEISFLTS TGKPHPRNTG FSYLRDSYVH

301 LPIRCKITIS DKQYRVHAAL AEATSAMTFS IFCKGKNCQV VDGPRLRSCS

351 LDSYKGPGND IMILGENDAI NIVSASPYME IFALQGKEKF WNADFLINIP

401 YKEEGVMLIF EKKVTSEKGR FFTKMN*

A predicted signal peptide is highlighted. 

The cp6696 nucleotide sequence <SEQ ID 102> is:

1 TTGACTCTAA TTTTTGTTAT TATTATCGTT TGGTGCAATG CTTTTCTGAT 

51 CAAATTGTGC GTGATAATGG GGCTGCAATC CAGGTTACAA CATTGTATAG 

101 AAGTGTCCCA GAATTCGAAC TTTGATTCAC AAGTAAAACA GTTTATCTAT 

151 GCGTGCCAAG ATAAGACATT AAGGCAGTCT GTACTCAAGA TTTTCCGCTA 

201 CCATCCTTTA CTAAAAATTC ATGATATTGC TCGGGCCGTC TATCTTTTGA 

251 TGGCCTTAGA AGAAGGCGAG GATTTAGGCT TAAGCTTTTT AAATGTACAG 

301 CAGTACCCTT CAGGTGCTGT AGAACTGTTT TCTTGTGGGG GATTTCCTTG

351 GAAAGGATTA CCTTATCCTG CAGAACATGC GGAATTTGGC CTACTCCTGT 

401 TACAGATCGC AGAGTTTTAT GAAGAGAGTC AGGCATACGT CTCTAAAATG 

451 AGTCATTTTC AACAGGCACT CTTTGATCAC CAAGGGAGCG TCTTTCCCTC 

501 TCTCTGGAGC CAGGAGAACT CTCGACTCCT AAAAGAAAAG ACAACTCTTA 

551 GCCAATCGTT TCTCTTCCAA TTAGGAATGC AAATTCACCC AGAATACAGT 

601 CTTGAGGATC CTGCACTAGG GTTCTGGATG CAAAGAACGC GTTCTTCATC 

651 CGCTTTTGTA GCCGCTTCAG GATGTCAAAG TAGCTTGGGA GCGTATTCCT 

701 CAGGGGATGT CGGTGTTATC GCTTATGGAC CTTGCTCTGG AGACATTAGT 

751 GATTGTTATT ATTTTGGATG TTGTGGAATC GCTAAAGAGT TCGTGTGCCA 

801 AAAATCTCAC CAAACTACAG AGATTTCTTT TCTCACCTCT ACAGGAAAGC 

851 CTCATCCCAG AAATACGGGA TTTTCCTACC TTCGAGATTC CTATGTACAT

901 CTGCCGATCC GCTGTAAGAT CACTATTTCC GACAAGCAAT ATCGCGTGCA 

951 CGCTGCGTTG GCTGAGGCCA CCTCTGCCAT GACGTTTTCT ATTTTCTCTA 

1001 AGGGGAAGAA TTGTCAGGTT GTTGACGGCC CTCGCTTGCG CTCCTGTTCC 

1051 CTAGATTCTT ATAAAGGTCC CGGAAACGAC ATTATGATTC TTGGGGAAAA 

1101 TGACGCAATC AACATTGTTT CTGCAAGTCC CTATATGGAA ATTTTTGCTT 

1151 TGCAAGGCAA AGAAAAATTT TGGAATGCAG ACTTTTTGAT TAATATTCCT 

1201 TACAAAGAAG AGGGCGTCAT GTTAATTTTT GAAAAAAAAG TGACCTCTGA 

1251 GAAAGGAAGA TTCTTTACGA AGATGAATTA A 
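
The listings in this document print each sequence in rows of five ten-character blocks, each row prefixed by the 1-based position of its first character. A minimal sketch of reading such rows back into a single string is given below. It is an illustration only: the function name `parse_numbered_rows` is hypothetical, and the only assumption made about the format is the row layout just described.

```python
def parse_numbered_rows(text: str) -> str:
    """Join '<offset> BLOCK BLOCK ...' rows into one sequence.

    Each row's leading offset must equal the number of characters read so
    far plus one; a mismatch raises ValueError so that damaged rows are
    reported instead of being silently concatenated.
    """
    sequence = ""
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # blank separator line between rows
        if not fields[0].isdigit():
            raise ValueError(f"row does not start with an offset: {line!r}")
        offset = int(fields[0])
        expected = len(sequence) + 1
        if offset != expected:
            raise ValueError(f"expected offset {expected}, found {offset}")
        sequence += "".join(fields[1:]).upper()
    return sequence


# First two rows of the cp6696 nucleotide listing, as printed above.
rows = """
1 TTGACTCTAA TTTTTGTTAT TATTATCGTT TGGTGCAATG CTTTTCTGAT

51 CAAATTGTGC GTGATAATGG GGCTGCAATC CAGGTTACAA CATTGTATAG
"""
print(len(parse_numbered_rows(rows)))  # 100
```

The offset check is the useful part of the sketch: a row whose printed position does not match the running length points to a dropped or duplicated block.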

The PSORT algorithm predicts an inner membrane location (0.463). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 51A.
The recombinant protein was used to immunise mice, whose sera were used in a Western blot
(Figure 51B) and for FACS analysis (Figure 51C). A his-tagged protein was also expressed.

This protein also showed good cross-reactivity with human sera, including sera from patients with 
pneumonitis. 

These experiments show that cp6696 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 52 

The following Cpneumoniae protein (pid 4376790) was expressed <SEQ ID 103; cp6790>: 



1 MSEHKKSSKI IGIDLGTTNS CVSVMEGGQA KVITSSEGTR TTPSIVAFKG
51 NEKLVGIPAK RQAVTNPEKT LGSTKRFIGR KYSEVASEIQ TVPYTVTSGS
101 KGDAVFEVDG KQYTPEEIGA QILMKMKETA EAYLGETVTE AVITVPAYFN
151 DSQRASTKDA GRIAGLDVKR IIPEPTAAAL AYGIDKVGDK KIAVFDLGGG
201 TFDISILEIG DGVFEVLSTN GDTLLGGDDF DEVIIKWMIE EFKKQEGIDL
251 SKDNMALQRL KDAAEKAKIE LSGVSSTEIN QPFITMDAQG PKHLALTLTR
301 AQFEKLAASL IERTKSPCIK ALSDAKLSAK DIDDVLLVGG MSRMPAVQET
351 VKELFGKEPN KGVNPDEVVA IGAAIQGGVL GGEVKDVLLL DVIPLSLGIE
401 TLGGVMTTLV ERNTTIPTQK KQIFSTAADN QPAVTIVVLQ GERPMAKDNK
451 EIGRFDLTDI PPAPRGHPQI EVSFDIDANG IFHVSAKDVA SGKEQKIRIE
501 ASSGLQEDEI QRMVRDAEIN KEEDKKRREA SDAKNEADSM IFRAEKAIKD
551 YKEQIPETLV KEIEERIENV RNALKDDAPI EKIKEVTEDL SKHMQKIGES
601 MQSQSASAAA SSAANAKGGP NINTEDLKKH SFSTKPPSNN GSSEDHIEEA
651 DVEIIDNDDK*

The cp6790 nucleotide sequence <SEQ ID 104> is: 



1 ATGAGTGAAC ACAAAAAATC AAGCAAAATT ATAGGTATAG ACTTAGGCAC 

51 AACAAACTCC TGCGTATCTG TTATGGAAGG AGGACAAGCT AAAGTAATTA 

101 CATCATCCGA AGGAACAAGA ACCACGCCAT CGATCGTTGC CTTCAAAGGT 

151 AATGAGAAAT TAGTGGGGAT TCCAGCAAAA CGTCAAGCAG TGACAAATCC 

201 AGAAAAAACT CTCGGCTCTA CAAAACGCTT TATTGGCCGT AAGTACTCTG 

251 AAGTAGCTTC GGAAATCCAA ACCGTTCCTT ATACAGTCAC CTCCGGATCT 

301 AAAGGTGATG CCGTTTTCGA AGTTGATGGC AAACAATACA CTCCAGAAGA 

351 AATTGGCGCA CAAATCTTAA TGAAAATGAA AGAGACAGCA GAAGCTTATC

401 TAGGCGAAAC TGTCACAGAA GCAGTGATCA CCGTCCCCGC ATACTTCAAT 

451 GATTCTCAAC GAGCATCCAC AAAAGATGCT GGACGCATTG CAGGTCTAGA 

501 TGTAAAACGT ATCATTCCAG AACCTACCGC AGCAGCTCTT GCCTACGGAA 

551 TCGATAAAGT CGGTGATAAA AAAATCGCTG TCTTCGACCT TGGTGGAGGA 

601 ACTTTTGATA TCTCCATCCT AGAAATCGGT GATGGCGTCT TCGAAGTTCT

651 ATCTACAAAT GGAGATACTC TCCTCGGTGG AGACGACTTT GATGAAGTCA 

701 TTATCAAATG GATGATCGAA GAATTCAAAA AACAAGAAGG CATTGATCTT 

751 AGCAAAGATA ATATGGCCTT ACAAAGACTT AAAGATGCTG CTGAGAAAGC 

801 AAAAATAGAA CTTTCAGGAG TCTCTTCCAC AGAAATCAAT CAGCCATTCA 

851 TCACAATGGA TGCACAAGGA CCTAAACACC TTGCATTGAC ACTCACACGT

901 GCGCAATTCG AGAAACTCGC AGCCTCTCTA ATCGAAAGAA CAAAATCTCC 

951 ATGCATCAAA GCACTCAGTG ACGCAAAACT TTCCGCTAAG GATATCGATG 

1001 ATGTTCTCTT AGTTGGAGGT ATGTCAAGAA TGCCCGCAGT GCAAGAAACT 

1051 GTAAAAGAAC TCTTCGGCAA AGAGCCTAAT AAAGGAGTCA ACCCCGACGA

1101 AGTTGTTGCT ATTGGAGCCG CAATTCAAGG TGGTGTTCTT GGCGGAGAAG

1151 TTAAGGATGT TCTACTTCTA GACGTTATCC CCCTATCTCT GGGTATCGAA 

1201 ACTCTAGGAG GCGTCATGAC GACTCTGGTA GAGAGAAATA CTACAATCCC 

1251 TACACAGAAA AAACAAATCT TCTCCACAGC TGCTGATAAC CAGCCTGCGG 

1301 TTACCATCGT AGTTCTCCAA GGAGAGCGTC CCATGGCCAA AGATAACAAG

1351 GAAATCGGAA GATTCGATCT TACAGATATC CCTCCGGCTC CTCGAGGCCA 

1401 TCCTCAAATC GAAGTCTCCT TCGATATCGA TGCAAACGGA ATTTTCCATG 

1451 TCTCAGCTAA AGATGTTGCC AGCGGTAAAG AACAGAAAAT TCGTATCGAA 

1501 GCAAGCTCAG GACTTCAAGA AGATGAAATC CAAAGAATGG TTCGAGATGC 

1551 CGAAATTAAT AAGGAAGAAG ATAAAAAACG TCGTGAAGCT TCAGATGCTA 

1601 AAAATGAAGC CGATAGCATG ATCTTCAGAG CCGAAAAAGC TATTAAAGAT

1651 TATAAGGAGC AAATTCCTGA AACTTTAGTT AAAGAAATCG AAGAGCGAAT 

1701 CGAAAACGTG CGCAACGCAC TCAAAGATGA CGCTCCTATT GAAAAAATTA 

1751 AAGAGGTTAC TGAAGACCTA AGCAAGCATA TGCAAAAAAT TGGAGAGTCT 

1801 ATGCAATCGC AGTCTGCATC AGCAGCAGCA TCATCGGCAG CCAATGCTAA

1851 AGGTGGACCT AACATCAATA CAGAAGATTT GAAAAAACAT AGTTTCAGTA

1901 CGAAGCCTCC TTCAAATAAC GGTTCTTCAG AAGACCATAT CGAAGAAGCT 

1951 GATGTAGAAA TTATTGATAA CGACGATAAG TAA 

The PSORT algorithm predicts an inner membrane location (0.151). 
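
Each example reports its PSORT result in the same sentence pattern, so the predictions can be tabulated directly from the text. The sketch below is illustrative only: the pattern `PSORT_RE` and the function `parse_psort` are hypothetical names, and the code assumes nothing about PSORT itself beyond the wording of the sentences in this document.

```python
import re

# Matches e.g. "... predicts an inner membrane location (0.151)."
PSORT_RE = re.compile(
    r"PSORT algorithm predicts an? (?P<location>[a-z ]+?) location "
    r"\((?P<score>\d+\.\d+)\)"
)


def parse_psort(sentence: str):
    """Return (location, score) from a PSORT sentence, or None if absent."""
    match = PSORT_RE.search(sentence)
    if match is None:
        return None
    return match.group("location"), float(match.group("score"))


print(parse_psort("The PSORT algorithm predicts an inner membrane location (0.151)."))
print(parse_psort("The PSORT algorithm predicts a periplasmic location (0.923)."))
```

The number in parentheses is carried through exactly as printed; the sketch does not interpret it.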

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 52A) and a
his-tagged product. The proteins were used to immunise mice, whose sera were used in Western blot
(Figure 52B) and FACS (Figure 52C) analyses.

The cp6790 protein was also identified in the 2D-PAGE experiment (Cpn0503). 

These experiments show that cp6790 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 53

The following C.pneumoniae protein (pid 4376878) was expressed <SEQ ID 105; cp6878>:

1 MNVPDSKNLH PPAYELLEIK ARITQSYKEA SAILTAIPPG ILLLSETGHF

51 LICNSQAREI LGIDENLEIL NRSFTDVLPD TCLGFSIQEA LESLKVPKTL

101 RLSLCKESKE KEVELFIRKN EISGYLFIQI RDRSDYKQLE NAIERYKNIA

151 ELGKMTATLA HEIRNPLSGI VGFASILKKE ISSPRHQRHL SSIISGTRSL

201 NNLVSSMLEY TKSQPLNLKI INLQDFFSSL IPLLSVSFPN CKFVREGAQP

251 LFRSIDPDRM NSVVWNLVKN AVETGNSPIT LTLHTSGDIS VTNPGTIPSE

301 IMDKLFTPFF TTKREGNGLG LAEAQKIIRL HGGDIQLKTS DSAVSFFIII

351 PELLAALPKE RAAS*

The cp6878 nucleotide sequence <SEQ ID 106> is: 

1 ATGAACGTCC CTGATTCCAA GAACCTCCAT CCTCCTGCAT ACGAACTCCT 

51 AGAGATCAAG GCTCGCATCA CACAATCTTA TAAAGAAGCG AGTGCTATAC 

101 TGACAGCGAT TCCTGATGGT ATCCTATTAC TTTCTGAAAC AGGACACTTT 

151 CTTATCTGCA ATTCACAAGC ACGTGAAATT CTAGGAATTG ATGAAAATCT 

201 AGAAATTCTT AATAGATCCT TTACCGATGT TCTCCCCGAT ACGTGTCTTG 

251 GATTTTCTAT TCAAGAGGCT CTTGAATCTC TAAAAGTCCC TAAAACTCTT 

301 AGACTCTCTC TCTGTAAAGA ATCTAAAGAA AAAGAAGTGG AACTCTTCAT 

351 CCGTAAAAAC GAGATCAGTG GATACCTGTT TATCCAAATC CGCGATCGGT 

401 CCGACTATAA ACAACTAGAA AACGCTATAG AAAGATATAA AAATATCGCA 

451 GAACTTGGGA AAATGACGGC TACCCTAGCT CACGAAATCC GCAATCCGCT 

501 AAGTGGAATC GTTGGATTTG CCTCTATCCT AAAGAAAGAG ATTTCCTCTC 

551 CTCGCCACCA ACGAATGCTC TCCTCAATCA TCTCCGGCAC AAGGTCTCTA 

601 AATAACCTTG TCTCTTCTAT GTTAGAATAT ACAAAATCAC AACCGTTGAA 

651 CCTAAAGATT ATAAATTTAC AAGACTTCTT CTCTTCTCTT ATCCCTCTGC 

701 TCTCCGTCTC TTTCCCGAAT TGCAAGTTTG TAAGAGAGGG CGCACAACCT 

751 CTATTCAGAT CTATAGATCC TGATCGGATG AACAGTGTCG TTTGGAACCT 

801 AGTGAAAAAT GCTGTAGAAA CAGGGAACTC TCCGATCACT CTGACCCTGC 

851 ATACATCGGG AGACATCTCG GTAACGAACC CCGGAACGAT TCCTTCCGAG 

901 ATCATGGACA AGCTCTTCAC TCCATTCTTC ACAACAAAGA GAGAGGGAAA 

951 TGGTTTGGGA CTTGCTGAAG CTCAAAAAAT TATAAGACTC CATCGAGGAG 

1001 ATATCCAATT AAAAACAAGC GACTCCGCCG TTAGCTTCTT CATAATCATC

1051 CCCGAACTTC TAGCGGCCCT ACCCAAAGAA AGAGCCGCTA G 

The PSORT algorithm predicts an inner membrane location (0.204). 

The protein was expressed in E.coli and purified as a his-tag product (Figure 53A) and as a
GST-fusion product. The recombinant GST-fusion protein was used to immunise mice, whose sera were
used in a Western blot (Figure 53B) and for FACS analysis.

These experiments show that cp6878 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 54 

The following C.pneumoniae protein (pid 4377224) was expressed <SEQ ID 107; cp7224>:

1 MMKKIRKVAL AVGGSGGHIV PALSVKEAFS REGIDVLLLG KGLKNHPSLQ

51 QGISYREIPS GLPTVLNPIK IMSRTLSLCS GYLKARKELK IFDPDLVIGF

101 GSYHSLPVLL AGLSHKIPLF LHEQNLVPGK VNQLFSRYAR GIGVNFSPVT

151 KHFRCPAEEV FLPKRSFSLG SPMKKRCTNH TPTICVVGGS QGAQILNTCV

201 PQALVKLVNK YPNMYVHHIV GPKSDVMKVQ HVYNRGEVLC CVKPFEEQLL

251 DVLLAADLVI SRAGATILEE ILWAKVPGIL IPYPGAYGHQ EVKAKFFVDV

301 LEGGTMILEK ELTEKLLVEK VTFALDSHNR EKQRNSLAAY SQQRSTKTFH

351 AFICECL*

The cp7224 nucleotide sequence <SEQ ID 108> is: 

1 ATGATGAAGA AAATTCGAAA AGTAGCCTTG GCTGTAGGAG GTTCAGGAGG 

51 CCACATTGTC CCAGCTCTCT CGGTAAAGGA AGCTTTTTCT CGTGAAGGAA 

101 TAGACGTATT ACTACTAGGG AAAGGTCTCA AGAACCATCC TTCTTTGCAA

151 CAGGGAATCA GCTATCGGGA AATCCCCTCA GGACTTCCTA CAGTCCTTAA

201 TCCCATAAAG ATCATGAGCA GGACCCTTTC TCTATGTTCA GGATACCTGA 

251 AAGCAAGAAA GGAACTTAAA ATTTTTGACC CTGACCTGGT CATAGGATTT 

301 GGGAGCTACC ACTCTCTTCC CGTGTTGCTC GCAGGACTGT CCCATAAAAT 

351 TCCCTTATTT CTACACGAAC AAAATCTAGT TCCTGGAAAA GTAAATCAAT 

401 TGTTTTCCCG CTATGCTCGA GGTATTGGAG TGAATTTCTC CCCCGTTACT 

451 AAACACTTCC GCTGCCCCGC AGAAGAGGTC TTCCTTCCTA AACGAAGCTT 

501 CTCCTTAGGA AGCCCTATGA TGAAGCGATG TACAAATCAT ACCCCTACAA 

551 TCTGTGTTGT TGGAGGTTCT CAGGGAGCAC AGATATTAAA TACTTGTGTT

601 CCCCAAGCTC TTGTCAAGCT AGTCAATAAG TACCCAAATA TGTACGTCCA

651 TCATATTGTA GGACCTAAAA GTGATGTTAT GAAGGTGCAA CATGTTTACA

701 ATCGTGGAGA GGTCCTCTGC TGTGTGAAGC CGTTCGAAGA GCAACTCCTA

751 GATGTCTTGC TTGCCGCAGA TTTGGTCATC AGTAGGGCAG GAGCCACAAT 

801 TTTAGAAGAA ATTCTTTGGG CAAAAGTTCC CGGAATTTTA ATTCCCTATC 

851 CAGGAGCTTA TGGACATCAG GAAGTTAATG CTAAATTCTT TGTAGACGTC

901 TTAGAAGGGG GAACTATGAT CCTAGAAAAA GAATTAACAG AGAAGCTATT 

951 AGTAGAAAAA GTAACGTTTG CTTTAGACTC CCATAACAGA GAAAAACAAC 

1001 GCAATTCCCT AGCGGCGTAT AGTCAGCAAA GGTCAACAAA AACATTCCAT 

1051 GCATTCAT07T GTGAATGCTT ATAG 

The PSORT algorithm predicts an inner membrane location (0.164).

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 54A. 
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 54B) and for FACS analysis (Figure 54C). A his-tagged protein was also expressed. 

This protein also showed good cross-reactivity with human sera, including sera from patients with 
pneumonitis.

These experiments show that cp7224 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 55 

The following C.pneumoniae protein (pid 4377140) was expressed <SEQ ID 109; cp7140>:

1 MVRRSISFCL FFLMTLLCCT SCNSRSLIVH GLPGREANSI VVLLVSKGVA

51 AQKLPQAAAA TAGAATEQMW DIAVPSAQIT EALAILNQAG LPRMKGTSLL

101 DLFAKQGLVP SELQEKIRYQ EGLSEQMAST IRKMDGVVDA SVQISFTTEN

151 EDNLPLTASV YIKHRGVLDN PNSIMVSKIK RLIASAVPGL VPENVSVVSD

201 RAAYSDITIN GPWGLTEEID YVSVWGIILA KSSLTKFRLI FYVLILILFV

251 ISCGLLWVIW KTHTLIMTMG GTKGFFNPTP YTKNALEAKK AEGAAADKEK

301 KEDADSQGES KNAETSDKDS SDKDAPEGSN EIEGA*

A predicted signal peptide is highlighted. 

The cp7140 nucleotide sequence <SEQ ID 110> is: 

1 ATGGTTCGTC GATCTATTTC TTTTTGCTTG TTCTTTCTAA TGACATTGCT 

51 GTGCTGTACA AGCTGTAACA GCAGGTCTCT AATTGTGCAC GGTCTTCCTG

101 GCAGAGAAGC GAATGAGATT GTGGTGCTTT TGGTAAGCAA AGGGGTGGCT 

151 GCACAAAAAT TGCCTCAAGC TGCAGCGGCT ACAGCCGGAG CAGCTACTGA 

201 GCAAATGTGG GATATCGCGG TTCCGTCAGC ACAAATCACA GAGGCCCTTG 

251 CCATTCTAAA TCAAGCGGGT CTTCCACGTA TGAAAGGGAC AAGCCTGTTA 

301 GATCTTTTTG CAAAACAAGG TCTTGTTCCT TCCGAGCTTC AGGAAAAAAT

351 CCGTTATCAA GAAGGCTTAT CAGAACAGAT GGCCTCTACG ATTAGAAAAA 

401 TGGATGGCGT TGTCGATGCC TCAGTACAGA TTTCCTTCAC TACAGAAAAT 

451 GAAGATAATC TTCCTTTAAC AGCCTCTGTG TATATTAAGC ATCGAGGGGT 

501 TTTGGACAAT CCGAACAGCA TTATGGTTTC CAAAATTAAG CGCCTTATTG 

551 CAAGTGCTGT TCCAGGACTT GTGCCAGAGA ACGTCTCTGT AGTGAGCGAT

601 CGCGCAGCTT ATAGTGATAT TACAATTAAT GGTCCTTGGG GATTAACAGA 

651 AGAAATCGAT TATGTTTCTG TTTGGGGTAT TATTCTTGCG AAGTCTTCGC 

701 TCACCAAATT CCGTCTCATT TTTTATGTCT TGATTCTCAT TTTATTTGTT 

751 ATTTCTTGTG GTCTCCTTTG GGTCATTTGG AAAACTCATA CTCTCATTAT 

801 GACTATGGGA GGTACAAAAG GGTTCTTCAA CCCTACACCA TATACAAAGA

851 ATGCCTTGGA AGCCAAGAAA GCCGAGGGAG CAGCTGCTGA CAAAGAGAAA 

901 AAAGAAGATG CAGATTCACA GGGGGAAAGC AAAAATGCGG AAACCAGTGA 

951 TAAAGAOTCT AGTGATAAAG ATGCTCCAGA AGGAAGCAAT GAAATTGAGG 

1001 GTGCTTAG 

The PSORT algorithm predicts an inner membrane location (0.650).

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 55A.
The recombinant protein was used to immunise mice, whose sera were used in a Western blot
(Figure 55B) and for FACS analysis (Figure 55C). A his-tagged protein was also expressed.

These experiments show that cp7140 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone.

Example 56 

The following C.pneumoniae protein (pid 4377306) was expressed <SEQ ID 111; cp7306>: 

1 MITKQLRSWL AVLVGSSLIA LPLSGQAVGK KESRVSELPQ DVLLKEISGG

51 FSKVATKATP AVVYIESFPK SQAVTHPSPG KRGPYENPFD YFNDEFFNRF

101 FGLPSQREKP QSKEAVRGTG FLVSPDGYIV TNNHVVEDTG KIHVTLHDGQ

151 KVPATVIGLD PKTDLAVIKI KSQNLPYLSF GNSDHLKVGD WAIAIGNPFG

201 LQATVTVGVI SAKGRNQLHI ADFEDFIQTD AAINPGNSGG PLLNIDGQVI

251 GVNTAIVSGS GGYIGIGFAI PSLMANRIID QLIRDGQVTR GFLGVTLQPI

301 DAELAACYKL EKVYGALVTD VYKGSPADKA GLKQEDVIIA YNGKEVDSLS

351 MFRNAVSLMN PDTRIVLKVV REGKVIEIPV TVSQAPKEDG MSALQRVGIR

401 VQNLTPETAK KLGIAPETKG ILIISVEPGS VAASSGIAPG QLILAVNRQK

451 VSSIEDLNRT LKDSNNENIL LMVSQGDVTR FIALKFEE*

A predicted signal peptide is highlighted. 

The cp7306 nucleotide sequence <SEQ ID 112> is: 

1 ATGATAACTA AGCAATTGCG TTCGTGGCTA GCTGTACTTG TTGGTTCAAG

51 TCTGCTAGCT CTTCCTTTAT CAGGGCAAGC TGTCGGGAAA AAAGAATCTC 

101 GAGTTTCCGA GCTGCCTCAA GACGTTCTTC TTAAAGAGAT CTCGGGAGGG 

151 TTTTCTAAGG TCGCTACCAA GGCGACTCCC GCTGTTGTGT ACATAGAAAG 

201 TTTCCCAAAG AGCCAGGCTG TAACACATCC TTCTCCTGGA CGCCGTGGGC 

251 CTTATGAAAA TCCTTTTGAT TATTTTAATG ATGAGTTTTT CAATCGTTTT

301 TTTGGTCTAC CTTCACAGAG GGAAAAACCT CAAAGTAAAG AGGCGGTTCG 

351 AGGAACAGGT TTCCTAGTAT CTCCAGATGG CTATATTGTG ACTAATAACC 

401 ATGTTGTCGA AGATACAGGT AAGATTCACG TAACTCTTCA TGATGGGCAA 

451 AAGTACCCAG CAACTGTAAT CGGACTCGAT CCTAAAACAG ACCTTGCAGT

501 CATTAAAATT AAATCCCAAA ACCTCCCGTA TCTTTCTTTT GGAAACTCCG

551 ACCACTTAAA AGTCGGAGAT TGGGCAATTG CAATTGGAAA TCCCTTCGGT

601 CTTCAAGCTA CGGTCACCGT AGGTGTCATC AGTGCTAAAG GAAGAAATCA 

651 ACTCCACATT GCAGATTTTG AAGATTTTAT TCAGACAGAT GCTGCGATTA 

701 ATCCAGGCAA CTCTGGAGGC CCTCTTCTAA ATATTGATGG ACAGGTCATC 

751 GGTGTTAATA CTGCCATTGT CAGTGGTAGT GGTGGCTATA TTGGAATCGG

801 GTTTGCGATT CCTAGCCTTA TGGCAAATAG AATCATAGAT CAGCTGATTC 

851 GTGATGGTCA AGTTACCCGA GGATTCTTAG GAGTGACTTT ACAACCTATA

901 GATGCGGAAC TCGCTGCTTG CTACAAACTC GAAAAGGTTT ATGGCGCTTT 

951 AGTCACAGAT GTTGTTAAAG GATCTCCAGC AGATAAAGCA GGGCTAAAAC 

1001 AAGAAGATGT GATCATTGCT TATAATGGGA AAGAAGTCGA TTCACTGAGT

1051 ATGTTCCGTA ATGCTGTTTC TTTAATGAAT CCAGATACAC GTATTGTTCT 

1101 AAAGGTAGTT CGTGAAGGAA AGGTTATCGA AATACCCGTG ACAGTTTCTC 

1151 AAGCTCCAAA AGAAGATGGA ATGTCGGCTT TACAGCGTGT GGGAATCCGT 

1201 GTGCAAAACC TAACTCCTGA AACTGCTAAG AAGCTGGGAA TTGCTCCAGA 

1251 GACTAAAGGC ATTTTGATTA TAAGTGTTGA ACCAGGGTCT GTAGCAGCTT

1301 CTTCAGGAAT TGCTCCTGGT CAGCTGATCC TTGCTGTGAA TAGACAAAAA 

1351 GTATCTTCGA TTGAAGATCT GAATAGAACG TTAAAAGATT CTAACAATGA 

1401 GAATATTCTT CTTATGGTTT CTCAAGGAGA TGTTATTCGC TTCATTGCCC 

1451 TGAAACCTGA AGAATAA 

The PSORT algorithm predicts a periplasmic location (0.923).

The protein was expressed in E.coli and purified as a his-tag product (Figure 56A) and as a
GST-fusion product (Figure 56B). The recombinant proteins were used to immunise mice, whose sera
were used in Western blot (Figure 56C) and FACS (Figure 56D) analyses.

The cp7306 protein was also identified in the 2D-PAGE experiment (Cpn0979) and showed good
cross-reactivity with human sera, including sera from patients with pneumonitis.

These experiments show that cp7306 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 57 

The following C.pneumoniae protein (pid 4377132) was expressed <SEQ ID 113; cp7132>: 

1 MCNSIAMKKQ KRGFVLMELL MSFTLIALLL GTLGFWYRKI YTVQKQKERI

51 YNFYIEESRA YKQLRTLFSM SLSSSYEEPG SLFSLIFDRG VYRDPKLAGA

101 VRASLHHDTK DQRLELRICN IKDQSYFETQ RLLSHVTHVV LSFQRNPDFE

151 KLPETIALTI TREPKAYPPR TLTYQFAVGK*

A predicted signal peptide is highlighted. 

The cp7132 nucleotide sequence <SEQ ID 114> is:

1 ATGTGTAACT CTATAGCTAT GAAAAAGCAA AAGCGTGGCT TTGTGCTTAT 

51 GGAATTACTC ATGTCGTTCA CTCTAATTGC TTTGTTATTA GGGACTTTAG 

101 GATTTTGGTA TCGGAAAATT TATACTGTAC AAAAGCAAAA AGAACGTATT 

151 TATAACTTTT ATATCGAAGA AAGCCGAGCC TACAAGCAGC TCAGAACCCT 

201 GTTTAGCATG TCCTTGTCTT CATCTTACGA GGAGCCTGGA TCATTATTTT
251 CTTTAATCTT TGATCGGGGT GTTTATCGAG ATCCTAAGCT GGCAGGTGCG 

301 GTACGAGCTT CTCTCCATCA TGACACCAAG GATCAGAGAT TGGAACTTCG
351 TATTTGTAAT ATTAAGGATC AGTCTTACTT TGAAACACAG CGACTGCTCT
401 CCCACGTGAC CCATGTTGTA CTTTCCTTCC AGAGAAATCC TGATCCTGAA 
451 AAACTTCCTG AAACAATTGC TTTAACTATA ACACGGGAAC CTAAAGCATA 
501 TCCTCCAAGG ACGTTAACAT ACCAATTTGC GGTTGGGAAA TAA 

The PSORT algorithm predicts a periplasmic location (0.915). 

The protein was expressed in E.coli and purified as a his-tag product (Figure 57A) or as a
GST-fusion. The recombinant proteins were used to immunise mice, whose sera were used in
Western blot (Figure 57B) and FACS (Figure 57C) analyses.

These experiments show that cp7132 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 58 



The following C.pneumoniae protein (pid 4376733) was expressed <SEQ ID 115; cp6733>:

1 MKTSIPWVLV SSVLAFSCHL QSLANEELLS PDDSFNGNID SGTFTPKTSA

51 TTYSLTGDVF FYEFGKGTPL SDSCFKQTTD NLTFLGNGHS LTFGFIDAGT

101 HAGAAASTTA NKNLTFSGFS LLSFDSSPST TVTTGQGTLS SAGGVNLENI

151 RKLVVAGNFS TADGGAIKGA SFLLTGTSGD ALFSNNSSST KGGAIATTAG

201 ARIANNTGYV RFLSNIASTS GGAIDDEGTS ILSNNKFDYF EGNAAKTTGG

251 AICNTKASGS PELIISNNKT LIFASNVAET SGGAIHAKKL ALSSGGFTEF

301 LRNNVSSATP KGGAISIDAS GELSLSAETG NITFVRNTLT TTGSTDTPKR

351 NAINIGSNGK FTELRAAKNH TIFFYDPITS EGTSSDVLKI NNGSAGALNP

401 YQGTILFSGE TLTADELKVA DNLKSSFTQP VSLSGGKLLL QKGVTLESTS

451 FSQEAGSLLG MDSGTTLSTT AGSITITNLG INVDSLGDKQ PVSLTAKGAS

501 NKVIVSGKLN LIDIEGNIYE SHMFSHDQLF SLLKITVDAD VDTNVDISSL

551 IPVPAEDPNS EYGFQGQWNV NWTTDTATNT KEATATWTKT GFVPSPERKS

601 ALVCNTLWGV FTDIRSLQQL VEIGATGMEH KQGFWVSSMT NFLHKTGDEN

651 RKGFRHTSGG YVIGGSAHTP KDDLFTFAFC HLFARDKDCF IAHNNSRTYG

701 GTLFFKHSHT LQPQNYLRLG RAKFSESAIE KFPREIPLAL DVQVSFSHSD

751 NRMETHYTSL PESEGSWSNE CIAGGIGLDL PFVLSNPHPL FKTFIPQMKV

801 EMVYVSQNSF FESSSDGRGF SIGRLLNLSI PVGAKFVQGD IGDSYTYDLS

851 GFFVSDVYRN NPQSTATLVM SPDSWKIRGG NLSRQAFLLR GSNNYVYNSN

901 CELFGHYAME LRGSSRNYNV DVGTKLRF*



A predicted signal peptide is highlighted. 

The cp6733 nucleotide sequence <SEQ ID 116> is: 



1 ATGAAGACTT CGATTCCTTG GGTTTTAGTT TCCTCCGTGT TAGCTTTCTC

51 ATGTCACCTA CAGTCACTAG CTAACGAGGA ACTTTTATCA CCTGATGATA

101 GCTTTAATGG AAATATCGAT TCAGGAACGT TTACTCCAAA AACTTCAGCC

151 ACAACATATT CTCTAACAGG AGATGTCTTC TTTTACGAGC CTGGAAAAGG

201 CACTCCCTTA TCTGACAGTT GTTTTAAGCA AACCACGGAC AATCTTACCT

251 TCTTGGGGAA CGGTCATAGC TTAACGTTTG GCTTTATAGA TGCTGGCACT

301 CATGCAGGTG CTGCTGCATC TACAACAGCA AATAAGAATC TTACCTTCTC

351 AGGGTTTTCC TTACTGAGTT TTGATTCCTC TCCTAGCACA ACGGTTACTA

401 CAGGTCAGGG AACGCTTTCC TCAGCAGGAG GCGTAAATTT AGAAAATATT

451 CGTAAACTTG TAGTTGCTGG GAATTTTTCT ACTGCAGATG GTGGAGCTAT

501 CAAAGGAGCG TCTTTCCTTT TAACTGGCAC TTCTGGAGAT GCTCTTTTTA

551 GTAACAACTC TTCATCAACA AAGGGAGGAG CAATTGCTAC TACAGCAGGC

601 GCTCGCATAG CAAATAACAC AGGTTATGTT AGATTCCTAT CTAACATAGC

651 GTCTACGTCA GGAGGCGCTA TCGATGATGA AGGCACGTCG ATACTATCGA

701 ACAACAAATT TCTATATTTT GAAGGGAATG CAGCGAAAAC TACTGGCGGT

751 GCGATCTGCA ACACCAAGGC GAGTGGATCT CCTGAACTGA TAATCTCTAA

801 CAATAAGACT CTGATCTTTG CTTCAAACGT AGCAGAAACA AGCGGTGGCG

851 CCATCCATGC TAAAAAGCTA GCCCTTTCCT CTGGAGGCTT TACAGAGTTT

901 CTACGAAATA ATGTCTCATC AGCAACTCCT AAGGGGGGTG CTATCAGCAT

951 CGATGCCTCA GGAGAGCTCA GTCTTTCTGC AGAGACAGGA AACATTACCT

1001 TTGTAAGAAA TACCCTTACA ACAACCGGAA GTACCGATAC TCCTAAACGT

1051 AATGCGATCA ACATAGGAAG TAACGGGAAA TTCACGGAAT TACGGGCTGC

1101 TAAAAATCAT ACAATTTTCT TCTATGATCC CATCACTTCA GAAGGAACCT

1151 CATCAGACGT ATTGAAGATA AATAACGGCT CTGCGGGAGC TCTCAATCCA

1201 TATCAAGGAA CGATTCTATT TTCTGGAGAA ACCCTAACAG CAGATGAACT

1251 TAAAGTTGCT GACAATTTAA AATCTTCATT CACGCAGCCA GTCTCCCTAT

1301 CCGGAGGAAA GTTATTGCTA CAAAAGGGAG TCACTTTAGA GAGCACGAGC

1351 TTCTCTCAAG AGGCCGGTTC TCTCCTCGGC ATGGATTCAG GAACGACATT

1401 ATCAACTACA GCTGGGAGTA TTACAATCAC GAACCTAGGA ATCAATGTTG

1451 ACTCCTTAGG TCTTAAGCAG CCCGTCAGCC TAACAGCAAA AGGTGCTTCA

1501 AATAAAGTGA TCGTATCTGG GAAGCTCAAC CTGATTGATA TTGAAGGGAA

1551 CATTTATGAA AGTCATATGT TCAGCCATGA CCAGCTCTTC TCTCTATTAA

1601 AAATCACGGT TGATGCTGAT GTTGATACTA ACGTTGACAT CAGCAGCCTT

1651 ATCCCTGTTC CTGCTGAGGA TCCTAATTCA GAATACGGAT TCCAAGGACA

1701 ATGGAATGTT AATTGGACTA CGGATACAGC TACAAATACA AAAGAGGCCA

1751 CGGCAACTTG GACCAAAACA GGATTTGTTC CCAGCCCCGA AAGAAAATCT

1801 GCGTTAGTAT GCAATACCCT ATGGGGAGTC TTTACTGACA TTCGCTCTCT

1851 GCAACAGCTT GTAGAGATCG GCGCAACTGG TATGGAACAC AAACAAGGTT

1901 TCTGGGTTTC CTCCATGACG AACTTCCTGC ATAAGACTGG AGATGAAAAT

1951 CGCAAAGGCT TCCGTCATAC CTCTGGAGGC TACGTCATCG GTGGAAGTGC

2001 TCACACTCCT AAAGACGACC TATTTACCTT TGCGTTCTGC CATCTCTTTG

2051 CTAGAGACAA AGATTGTTTT ATCGCTCACA ACAACTCTAG AACCTACGGT

2101 GGAACTTTAT TCTTCAAGCA CTCTCATACC CTACAACCCC AAAACTATTT

2151 GAGATTAGGA AGAGCAAAGT TTTCTGAATC AGCTATAGAA AAATTCCCTA

2201 GGGAAATTCC CCTAGCCTTG GATGTCCAAG TTTCGTTCAG CCATTCAGAC

2251 AACCGTATGG AAACGCACTA TACCTCATTG CCAGAATCCG AAGGTTCTTG

2301 GAGCAACGAG TGTATAGCTG GTGGTATCGG CCTAGACCTT CCTTTTGTTC

2351 TTTCCAACCC ACATCCTCTT TTCAAGACCT TCATTCCACA GATGAAAGTC

2401 GAAATGGTTT ATGTATCACA AAATAGCTTC TTCGAAAGCT CTAGTGATGG

2451 CCGTGGTTTT AGTATTGGAA GGCTGCTTAA CCTCTCGATT CCTGTGGGTG

2501 CGAAATTCGT GCAGGGGGAT ATCGGAGATT CCTACACCTA TGATCTCTCA

2551 GGATTCTTTG TTTCCGATGT CTATCGTAAC AATCCCCAAT CTACAGCGAC

2601 TCTTGTGATG AGCCCAGACT CTTGGAAAAT TCGCGGTGGC AATCTTTCAA

2651 GACAGGCATT TTTACTGAGG GGTAGCAACA ACTACGTCTA CAACTCCAAT

2701 TGTGAGCTCT TCGGACATTA CGCTATGGAA CTCCGTGGAT CTTCAAGGAA

2751 CTACAATGTA GATGTTGGTA CCAAACTCCG ATTOTAG



The PSORT algorithm predicts an outer membrane location (0.924).

The protein was expressed in E.coli and purified as a his-tag product, as shown in Figure 58A. The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
58B) and for FACS (Figure 58C) analyses. A GST-fusion protein was also expressed.

The cp6733 protein was also identified in the 2D-PAGE experiment (Cpn0451).

These experiments show that cp6733 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 59 

The following C.pneumoniae protein (pid 4376814) was expressed <SEQ ID 117; cp6814>:

1 MHDALfcSILA IQELDIKMIR LMRVKKEHQK ELAKVQSLKS DIRRKVQEKE

51 LEMENLKTQI RIX5ENRIQEI SEQINKLENQ QAAVKKMDEF NALTQEMTTA

101 NKERRSLEHQ LSDLMDKQAG GEDLIVSLKE SLASTENSSS VIEKEIFESI

151 KKINEEGKAL LEQRTELKHA TNPELLSIYE RLLWWKKDRV WPIENRVCS

201 GCHXVLTPQH EWLVRKKDRL IFCEHCSRIL YWQESQVNAQ ENSTAKRRRR

251 RAAV*

The cp6814 nucleotide sequence <SEQ ID 118> is:

1 ATGCATGACG

51 AATGATTCGC

101 AAGTCCAATC

151 CTCGAAATGG

201 CCAAGAGATT

251 TAAAAAAAAT

301 AACAAAGAAC

351 GCAAGCTGGA

401 CTACAGAAAA

451 AAAAAGATTA

501 AAAGCATGCG

551 ACAATAAAAA

601 GGTTGTCATA

651 AGACCGACTC

701 AATCCCAAGT

751 CGCGCAGCTG

The PSORT algorithm predicts an inner membrane location (0.070).

The protein was expressed in E.coli and purified as a GST-fusion (Figure 59A) or his-tagged
product. The recombinant proteins were used to immunise mice, whose sera were used in Western
blot (Figure 59B) and FACS (Figure 59C) analyses.

These experiments show that cp6814 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.

Example 60

The following C.pneumoniae protein (pid 4376830) was expressed <SEQ ID 119; cp6830>:

1 MKWLPATAVF AAVLPALTAF GDPASVEIST SHTGSGDPTS DAALTGFTQS

51 STETDGTTYT IVGDITFSTF TNIFVPVVTP DANDSSSNSS KGGSSSSGAT

101 SLIRSSNLHS DFDFTKDSVL DLYHLFFPSA SNTLNPALLS SSSSGGSSSS

151 SSSSSSGSAS AVVAADPKGG AAFYSNEANG TLTFTTDSGN PGSLTLQNLK

201 MTGDGAAIYS KGFLVFTGLK NLTFTGNESQ KSGGAAYTEG ALTTQAIVEA

251 VTFTGKTSAG QGGAIYVKEA TLFNALDSLK FEKNTSGQAG GGIYTESTLT

301 ISNITKSIEF ISNKASVPAF APEPTSPAPS SLINSTTIDT STLQTRAASA

351 TPAVAPVAAV TPTPISTQET AGNGGAIYAK QGISISTFKD LTFKSNSASV
401 DATLTVDSST IGESGGAIFA ADSIQIQQCT GTTLFSGNTA NKSGGGIYAV

451 GQVTLEDIAN LKMTNNTCKG EGGAIYTKKA LTINNGAILT TFSGNTSTDN

501 GGAIFAVGGI TLSDLVEVRF SKNKTGNYSA PITKAASNTA PVVSSSTTAA

551 SPAVPAAAAA PVTNAAKGGA LYSTEGLTVS GITSILSFEN NECQNQGGGA

601 YVTKTFQCSD SHRLQFTSNK AADEGGGLYC GDDVTLTNLT GKTLFQENSS

651 EKHGGGLSLA SGKSLTMTSL ESFCLNANTA KENGGGANVP ENIVLTFTYT

701 PTPNEPAPVQ QPVYGEALVT GNTATKSGGG IYTKNAAFSN LSSVTFDQNT

751 SSENGGALLT QKAADKTDCS FTYITNVNIT NNTATGNGGG IAGGKAHFDR

801 IDNLTVQSNQ AKKGGGVYLE DALILEKVIT GSVSQNTATE SGGGIYAKDI

851 QLQALPGSFT ITDNKVETSL TTSTNLYGGG IYSSGAVTLT NISGTFGITG

901 NSVINTATSQ DADIQGGGIY ATOSLSINQC NTPILFSNNS AATKKTSTTK

951 QIAGGAIFSA AVTIENNSQP IIFLNNSAKS EATTAATAGN KDSCGGAIAA

1001 NSVTLTNNPE ITFKGNYAET GGAIGCIDLT NGSPPRKVSI ADNGSVLFQD

1051 NSALNRGGAI YGETIDISRT GATFIGNSSK HDGSAICCST ALTLAFNSQL

1101 IFENNKVTET TATTKASINN LGAAIYGNNE TSDVTISLSA ENGSIFFKNN

1151 LCTATNKYCS IAGNVKFTAI EASAGKAISF YDAVNVSTKE TNAQELKLNE

1201 KATSTGTILF SGELHENKSY IPQKVTFAHG NLILGKNAEL SWSFTQSPG

1251 TTITMGFGSV LSNHSKEAGG IAINNVIIDF SEIVPTKDNA TVAPPTLKLV

1301 SOTNADSKDK IDITGTVTLL DPWGNLYQNS YLGEDRDITL FNIDNSASGA

1351 VTATNVTLQG NLGAKKGYLG TWNLDPNSSG SKIILKWTFD KYLRWPYIPR

1401 DNHFYINSIW GAQNSLVTVK OGILGNMLJNN ARFEDPAFNN FWASAIGSFI,

1451 RKEVSRNSDS FTYHGRGYTA AVDAKPRQEF ILGAAFSQVF GHAESEYHLD

1501 NYKHKGSGHS TQASLYAGNI FYFPAIRSRP ILFQGVATYG YMQHDTTTYY

1551 PSIEEKNMAN WDSIAWLFDL RFSVDLKEPQ PHSTARLTFY TEAEYTRIRQ

1601 EKFTELDYDP RSFSACSYGN LAIPTGFSVD GALAWREIIL YNKVSAAYLP

1651 VILRNNPKAT YEVLSTKEKG NWNVLPTRN AARAEVSSQI YLGSYWTLYG

1701 TYTIDASMNT LVQMANGGIR FVF*

A predicted signal peptide is highlighted. 

The cp6830 nucleotide sequence <SEQ ID 120> is: 

1 ATGAAGTGGC TACCAGCTAC AGCTGTTTTT GCTGCCGTAC TCCCCGCACT

51 AACAGCCTTC GGAGATCCCG CGTCTGTTGA AATAAGTACC AGCCATACAG

101 GATCCGGGGA TCCTACAAGC GACGCTGCCT TAACAGGATT TACACAAAGT 

151 TCCACAGAAA CTGACGGTAC TACCTATACC ATTGTCGGTG ATATCACCTT 

201 CTCTACTTTT ACGAATATTC CTGTTCCCGT AGTAACTCCA GACGCCAACG 

251 ATAGTTCCAG CAATAGCTCT AAAGGAGGAA GTAGCAGTAG TGGAGCTACA

301 TCTCTAATCC GATCCTCAAA CCTACACTCC GATTOTGATT TTACAAAAGA 

351 TAGCGTGTTA GACCTCTATC ACCTTTTCTT TCCTTCAGCT TCAAATACTC 

401 TCAATCCTGC ACTCCTTTCT TCCAGTAGCA GCGGTGGATC CTCGAGCAGC 

451 AGTAGCTCCT CATCATCTGG AAGTGCATCT GCTGTTGTTG CTGCGGACCC 

501 AAAAGGAGGC GCTGCCTTTT ATAGTAACGA GGCTAACGGA ACTTTAACCT

551 TCACTACAGA CTCTGGAAAT CCCGGCTCCC TGACTCTTCA GAATCOTAAA 

601 ATGACCGGAG ATGGAGCCGC CATCTACTCG AAGGGTCCTC TAGTATTTAC 

651 TGGTTTAAAA AATCTAACCT TTACAGGAAA TGAATCTCAG AAATCTGGAG 

701 GTGCTGCCTA TACTGAAGGC GCACTCACAA CACAAGCAAT CGTTGAAGCC 

751 GTAACTTTTA CTGGCAACAC CTCGGCAGGG CAAGGAGGCG CTATCTATGT

801 TAAAGAAGCT ACCCTATTCA ATGCTCTAGA CAGCCTCAAA TTTGAAAAAA 

851 ACACTTCTGG GCAAGCTGGT GGTGGAATCT ATACAGAGTC TACGCTCACA 

901 ATCTCGAACA TCACAAAATC TATTGAATTT ATCTCTAATA AAGCTTCTGT 

951 CCCTGCCCCC GCTCCTGAGC CCACCTCrcC GGCTCCAAGT AGCTTAATAA 

1001 ATTCTACAAC GATCGATACC TCGACTCTCC AAACCCGAGC AGCATCCGCA

1051 ACTCCAGCAG TGGCTCCTGT TGCTGCCGTA ACTCCAACAC CAATCTCTAC

1101 TCAAGAGACC GCAGGAAATG GAGGCGCTAT CTATGCTAAA CAAGGTATTT 

1151 CGATATCCAC GTTTAAAGAT CTGACCTTCA AGTCTAACTC TGCATCGGTA 

1201 GATGCCACCC TTACTGTCGA TTCTAGCACT ATTGGAGAAT CTGGAGGTGC 

1251 TATCTTTGCA GCAGACTCTA TACAAATCCA ACAGTGCACG GGAACCACCT

1301 TATTCAGTGG CAATACTGCC AATAAGTCTG GTGGGGGTAT TTACGCTGTA 

1351 GGACAAGTCA CCCTAGAAGA TATAGCGAAT CTGAAGATGA CCAACAACAC 

1401 CTGTAAAGGT GAAGGTGGAG CCATCTACAC TAAAAAGGCT TTAACTATCA 

1451 ACAACGGTGC CATTCTCACT ACATTTTCTG GAAATACATC GACAGATAAT 

1501 GGTGGGGCTA TTTTTGCTGT AGGTGGCATC ACTCTCTCTG ATCTTGTAGA

1551 AGTCCGCTTT AGTAAAAATA AGACCGGAAA TTATTCCGCT CCTATTACCA 

1601 AAGCGGCTAG CAACACAGCT CCTGTAGTTT CTAGCTCTAC AACTGCTGCA 

1651 TCTCCTGCGG TCCCTGCTGC CGCTGCAGCA CCTGTTACAA ACGCAGCAAA 

1701 AGGAGGGGCT TTATATAGTA CAGAAGGACT GACTGTATCT GGAATCACAT 

1751 CGATATTGTC GTTTGAAAAC AACGAATGCC AGAATCAAGG AGGTGGGGCT

1801 TACGTTACTA AAACCTTCCA GTGTTCCGAT TCTCATCGCC TCCAGTTTAC

1851 TAGTAATAAA GCAGCAGATG AAGGCGGGGG CCTGTATTGT GGTGACGATG 

1901 TCACGCTAAC GAACCTGACA GGGAAAACAC TATTTCAAGA GAATAGCAGT 

1951 GAGAAACATG GAGGTGGGCT CTCTCTCGCC TCAGGAAAAT CTCTGACTAT 

2001 GACATCGTTA GAGAGCTTCT GCTTAAATGC AAATACAGCA AAGGAAAACG

2051 GAGGCGGTGC GAATGTCCCT GAAAATATTG TACTCACCTT CACCTATACT

2101 CCCACTCCAA ATGAACCTGC GCCTGTGCAG CAGCCCGTGT ATGGAGAAGC 

2151 TCTTGTTACT GGAAATACAG CCACAAAAAG TGGTGGGGGC ATTTACACGA 

2201 AAAATGCGGC CTTCTCAAAT TTATCTTCTG TAACTTTTGA TCAAAATACC 

2251 TCTTCAGAAA ATGGTGGTGC CTTACTTACC CAAAAAGCTG CAGATAAAAC

2301 GGACTGTTCT TTCACCTATA TTACAAATGT CAATATCACC AACAATACAG 

2351 CTACAGGAAA TGGTGGGGGC ATTGCTGGGG GAAAAGCACA TTTCGATCGC 

2401 ATTGATAATC TTACAGTCCA AAGCAACCAA GCAAAGAAAG GTGGTGGGGT 

2451 TTATCTTGAA GATGCCCTCA TCCTGGAAAA GGTTATTACA GGTTCTGTCT 

2501 CACAAAATAC AGCTACAGAA AGTGGTGGGG GTATCTACGC TAAGGATATT

2551 CAACTACAAG CTCTACCTGG AAGCTTCACA ATTACCGATA ATAAAGTCGA

2601 AACTAGTCTT ACTACTAGCA CTAATTTATA TGGTGGGGGC ATCTATTCCA 

2651 GTGGAGCTGT CACGCTAACC AATATATCTG GAACCTTTGG CATTACAGGA 

2701 AACTCTGTTA TCAATACAGC GACATCCCAG GATGCAGATA TACAAGGTGG 

2751 GGGCATTTAT GCAACCACGT CTCTCTCAAT AAATCAATGT AATACACCCA

2801 TTCTATTTAG CAACAACTCT GCTGCCACTA AAAAAACATC AACAACAAAG 

2851 CAAATTGCTG GTGGGGCTAT CTTCTCCGCT GCAGTAACTA TCGAGAATAA

2901 CTCTCAGCCC ATTATTTTCT TAAATAATTC CGCAAAGTCG GAAGCAACTA

2951 CAGCAGCAAC TGCAGGAAAT AAAGATAGCT GTGGAGGAGC CATTGCAGCT 

3001 AACTCTGTTA CTTTAACAAA TAACCCTGAA ATAACCTTTA AAGGAAATTA

3051 TGCAGAAACT GGAGGAGCGA TTGGCTGTAT TGATCTTACT AATGGCTCAC 

3101 CTCCCCGTAA AGTCTCTATT GCAGACAACG GTTCTGTCCT TTTTCAAGAC 

3151 AACTCTGCGT TAAATCGCGG AGGCGCTATC TATGGAGAGA CTATCGATAT

3201 CTCCAGGACA GGTGCGACTT TCATCGGTAA CTCTTCAAAA CATGATGGAA 

3251 GTGCAATTTG CTGTTCAACA GCCCTAACTC TTGCGCCAAA CTCCCAACTT

3301 ATCTTTGAAA ACAATAAGGT TACGGAAACC ACAGCCACTA CAAAAGCTTC 

3351 CATAAATAAT TTAGGAGCTG CAATTTATGG AAATAATGAG ACTAGTGACG 

3401 TCACTATCTC TTTATCAGCT GAGAATGGAA GTATTTTCTT TAAAAACAAT 

3451 CTATGCACAG CAACAAACAA ATACTGCAGT ATTGCTGGAA ACGTAAAATT 

3501 TACAGCAATA GAAGCTTCAG CAGGGAAAGC TATATCTTTC TATGATGCAG

3551 TTAACGTTTC CACCAAAGAA ACAAATGCTC AAGAGCTAAA ATTAAATGAA 

3601 AAAGCGACAA GTACAGGAAC GATTCTATTT TCTGGGGAAC TTCACGAAAA 

3651 TAAATCCTAT ATTCCACAGA AAGTCACTTT CGCACATGGG AATCTCATTC

3701 TAGGTAAAAA TGCAGAACTT AGCGTAGTTT CCTTTACCCA ATCTCCAGGC 

3751 ACCACAATCA CTATGGGCCC AGGATCGGTT CTTTCCAACC ATAGCAAAGA

3801 AGCAGGAGGA ATCGCTATAA ACAATGTCAT CATTGATTTT AGTGAAATCG 

3851 TTCCTACTAA AGATAATGCA ACAGTAGCTC CACCCACTCT TAAATTAGTA 

3901 TCGAGAACTA ATGCAGATAG TAAAGATAAG ATTGATATTA CAGGAACTGT 

3951 GACTCTTCTA GATCCTAATG GCAACTTATA TCAAAATTCT TATCTTGGTG 

4001 AAGACCGCGA TATCACTCTT TTCAATATAG ACAATTCTGC AAGTGGGGCA

4051 GTTACAGCCA CGAATGTCAC CCTTCAAGGG AATTTAGGAG CTAAAAAAGG 

4101 ATATTTAGGA ACCTGGAATT TGGATCCAAA TTCCTCGGGT TCAAAAATTA 

4151 TTCTAAAATG GACCTTTGAC AAATACCTGC GCTGGCCCTA CATCCCTAGA 

4201 GACAACCACT TCTACATCAA CTCTATTTGG GGAGCACAAA ACTCTTTAGT 

4251 GACTGTGAAA CAAGGGATCT TAGGGAACAT GTTGAACAAT GCAAGGTTTG

4301 AAGATCCTGC TTTCAACAAC TTCTGGGCTT CGGCTATAGG ATCTTTCCTT 

4351 AGGAAAGAAG TATCTCGAAA TTCTGACTCA TTCACCTATC ATGGCAGAGG 

4401 CTATACCGCT GCTGTGGATG CCAAACCTCG CCAAGAATTT ATTTTAGGAG

4451 CTGCCTTCAG TCAGGTTTTT GGTCACGCCG AGTCTGAATA TCACCTTGAC 

4501 AACTATAAGC ATAAAGGCTC AGGTCACTCT ACACAAGCAT CTCTTTATGC

4551 TGGCAATATC TTCTATTTTC CTGCGATACG GTCTCGGCCT ATTCTATTCC

4601 AAGGTGTGGC GACCTATGGT TATATGCAAC ATGACACCAC AACCTACTAT 

4651 CCTTCTATTG AAGAAAAAAA TATGGCAAAC TGGGATAGCA TTGCTTGGTT 

4701 ATTTGATCTG CGTTTCAGTG TGGATCTTAA AGAACCTCAA CCTCACTCTA 

4751 CAGCAAGGCT TACCTTCTAT ACAGAAGCTG AGTATACCAG AATTCGCCAG

4801 GAGAAATTCA CAGAGCTAGA CTATGATCCT AGATCTTTCT CTGCATGCTC

4851 TTATGGAAAC TTAGCAATTC CTACTGGATT CTCTGTAGAC GGAGCATTAG 

4901 CTTGGCGTGA GATTATTCTA TATAATAAAG TATCAGCTGC GTACCTCCCT 

4951 GTGATTCTCA GGAATAATCC AAAAGCGACC TATGAAGTTC TCTCTACAAA 

5001 AGAAAAGGGC AACGTAGTCA ACGTTCTCCC TACAAGAAAC GCAGCTCGTG

5051 CAGAGGTGAG CTCTCAAATT TATCTTGGAA GTTACTGGAC ACTCTACGGC 

5101 ACGTATACTA TTGATGCTTC AATGAATACT TTAGTGCAAA TGGCCAACGG 

5151 AGGGATCCGG TTTGTATTCT AG

The PSORT algorithm predicts an outer membrane location (0.926).

The protein was expressed in E.coli and purified as a GST-fusion (Figure 60A) or his-tagged
product. The recombinant proteins were used to immunise mice, whose sera were used in Western 
blot (Figure 60B) and FACS (Figure 60C) analyses. 

The cp6830 protein was also identified in the 2D-PAGE experiment (Cpn0540) and showed good
cross-reactivity with human sera, including sera from patients with pneumonitis. 

These experiments show that cp6830 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 61 

The following C.pneumoniae protein (pid 4376854) was expressed <SEQ ID 121; cp6854>:

1 MSIAIAREQY AAILDMHPKP SIAMFSSEQA RTSWEKRQAH PYLYRLLEII

51 WGVVKFLLGL IFFIPLGLFW VLQKICQNFI LLGAGGWIFR PICRDSNLLR

101 QAYAARLFSA SFQDHVSSVR RVCLQYDEVF IDGLELRLPN AKPDRWMLIS

151 NGNSDCLEYR TVLQGEKDWI FRIAEESQSN ILIFNYPGVM KSQGNITRNN

201 VVKSYQACVR YLRDEPAGPQ ARQIVAYGYS LGASVQAEAL SKEIADGSDS

251 VRWFVVKDRG ARSTGAVAKQ FIGSLGVWLA NLTHWNINSE KRSKDLHCPE

301 LFIYGKDSQG NLIGDGLFKK ETCFAAPFLD PKNLEECSGK KIPVAQTGLR

351 HDHILSDDVI KEVAGHIQRH FDN*

The cp6854 nucleotide sequence <SEQ ID 122> is: 

1 ATGTCAATAG CTATTGCAAG GGAACAATAC GCAGCTATAT TGGATATGCA

51 TCCTAAACCT TCGATCGCCA TGTTTTCTTC GGAGCAGGCG AGAACTTCTT

101 GGGAGAAACG ACAGGCTCAT CCTTACCTTT ATCGTCTTCT TGAGATCATA

151 TGGGGTGTTG TGAAATTTCT TCTCGGCTTA ATCTTCTTTA TTCCCTTGGG

201 TCTTTTCTGG GTCCTTCAGA AGATATGTCA GAATTTTATT CTTCTTGGTG

251 CAGGAGGGTG GATTTTTAGA CCCATATGCA GGGACTCTAA TTTATTGCGA

301 CAAGCTTACG CCGCGCGTCT TTTCTCCGCT TCATTCCAAG ATCATGTCTC

351 CTCTGTGCGA AGGGTTTGCT TACAGTATGA CGAGGTCTTT ATTGACGGAT

401 TGGAGTTACG TCTTCCCAAT GCTAAGCCAG ATCGATGGAT GTTAATCTCC

451 AATGGAAACT CCGATTGCTT AGAGTATAGG ACAGTGCTGC AAGGGGAAAA

501 GGACTGGATA TTCCGTATTG CTGAAGAGTC TCAATCCAAC ATTTTAATCT

551 TCAATTACCC AGGAGTCATG AAGAGCCAAG GGAATATAAC AAGAAACAAT

601 GTAGTCAAAT CTTATCAAGC ATGCGTACGC TATCTTAGAG ATGAACCCGC

651 AGGACCTCAG GCGCGTCAAA TCGTTGCTTA TGGCTATTCT TTAGGAGCTA

701 GTGTTCAAGC CGAAGCATTA AGTAAAGAGA TCGCAGACGG AAGTGATAGC

751 GTCCGTTGGT TTGTCGTTAA AGATCGAGGA GCTCGCTCTA CAGGAGCCGT

801 TGCTAAACAG TTTATTGGAA GTCTAGGAGT TTGGCTGGCG AATCTTACCC

851 ATTGGAATAT TAATTCTGAA AAGAGAAGCA AGGACTTGCA TTGCCCAGAA

901 CTCTTTATTT ATGGCAAGGA TTCCCAAGGT AATCTTATCG GGGATGGATT

951 GTTCAAAAAA GAGACGTGCT TCGCAGCACC ATTTTTAGAT CCTAAAAACT

1001 TGGAAGAGTG TTCAGGGAAG AAAATCCCTG TAGCTCAGAC CGGTCTAAGA

1051 CACGATCATA TCCTTTCCGA TGATGTGATT AAAGAAGTTG CAGGTCATAT

1101 TCAAAGACAT TTCGATAATT A

The PSORT algorithm predicts an inner membrane location (0.461).

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 61A.
The recombinant protein was used to immunise mice, whose sera were used in Western blot (Figure
61B) and FACS (Figure 61C) analyses. A his-tagged protein was also expressed.

These experiments show that cp6854 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.
Example 62

The following C.pneumoniae protein (pid 4377101) was expressed <SEQ ID 123; cp7101>:



1 MYSCYSKGIS HNYLLHPMSR LDIFVFDSLI ANQDQNLLEE IFCSEDTVLF

51 KAYRTTALQS PLAAKNLNIA RKVANYILAD NGEIDTVKLV EAIHHLSQCT

101 YPLGPHRHNE AQDREHLLKM LKALKENPKL KESIKTLFVP SYSTIQNLIR

151 HTLALNPQTI LSTIHVRQAA LTALFTYLRQ DVGSCFATAP AILIHQEYPE

201 RFLKDLNDLI SSGKLSRIVN QREIAVPINL SGCIGELFKP LRILDLYPDP

251 LVKLSSSPGL KKAFSAANLI ETLGDSEAQI QQLLSHQYLM QKLQNVHETL

301 TANDIIKSTL LHYYQLQEST VRAIFFKEGL FSKEQVAFST QHPRELSEIQ

351 RVYHYLHAYE EAKSAFIHDT QNPLLKAWEY TLATLADASQ PTISNHIRLA

401 LGWKSEDPHS LVSLVTHFVE EEVENIRILV QQCEQTYHEA RSQLEYIEGR

451 MRNPLNMQDS QILTMDHMRF RQELNKALYE WDSAQEKAKK FLHLPEFLLS

501 FYTKQIPLYF RSSYDAFIQE FAHLYANAPA GFRILFTHGR THPNTWSPIY

551 SINEFIRFLS EFFTSTESEL LGKHAVINLE KETSRLVHNI TAMLHTDVFQ

601 EALLTRILEA YQLPVPPSIL NHLDQLSQTP WVYVSGGTVD TLLLDYFESS

651 EPLTLTEKHP ENPHELAAFY ADALKDLPTG IKSYLEEGSH SLLSSSPTHV

701 FSIIAGSPLF REAWDNDWYS YTWLRDVWVK QHQDFLQDTI LPQLSIYAFI

751 ENFCNKYALQ HVVHDFHDFC SDHSLTLPEL YDKGSRFLSS LFTKDKTVAL

801 IYIRRLLYLM VREVPYVSEQ QLPEVLDNVS SYLGISSRIT YEKFRSLIEE

851 TIPKMTLLSS ADLRHIYKGL LMQSYQKIYT EEDTYLRLTT AMRHHNLAYP

901 APLLFADSNW PSIYFGFILN PGTTEIDLWK FNYAGLQGQP LDNIQELFAT

951 SRPWTLYANP IDYGMPPPPG YRSRLPKEFF

The cp7101 nucleotide sequence <SEQ ID 124> is:

1 ATGTATTCGT GTTACAGCAA AGGAATATCC CATAACTATC TTCTACATCC

51 TATGTCACGT TTGGATATTT TTGTTTTCGA TTCTCTGATC GCAAACCAGG

101 ATCAAAATCT TCTTGAGGAA ATTTTCTGTT CTGAAGACAC AGTTTTATTT

151 AAAGCCTACC GTACTACGGC TCTACAATCC CCTCTAGCTG CTAAGAACCT

201 AAATATCGCC CGTAAAGTCG CAAATTATAT CTTAGCTGAC AATGGGGAAA

251 TCGATACAGT AAAGCTTGTC GAAGCCATTC ACCATCTCTC ACAATGTACC

301 TATCCTTTAG GGCCTCATCG CCATAATGAA GCTCAAGATC GTGAACACCT

351 CCTTAAAATG CTAAAAGCTC TAAAGGAAAA TCCTAAATTA AAAGAAAGCA

401 TCAAAACTCT CTTTGTCCCT TCATACTCTA CAATCCAAAA CCTAATTCGC

451 CATACACTAG CATTGAATCC ACAGACAATT CTCTCTACGA TTCATGTGCG

501 TCAAGCAGCA CTCACAGCGC TCTTCACCTA CCTTCGGCAA GATGTAGGTT

551 CCTGTTTTGC TACGGCTCCT GCCATTCTCA TTCACCAAGA ATATCCAGAA

601 GGATTCCTTA AAGATCTCAA TGATCTCATT AGCAGTGGCA AACTCTCTAG

651 AATCGTAAAC CAAAGGGAAA TTGCGGTTCC TATAAACCTT TCGGGATGCA

701 TTGGAGAGCT ATTCAAGCCT TTAAGGATTC TAGATCTTTA TCCTGATCCT

751 CTGGTTAAGC TCTCCTCATC TCCAGGACTC AAAAAAGCCT TTTCTGCTGC

801 CAATCTTATT GAAACTCTTG GGGATTCTGA AGCACAAATC CAACAGTTGC

851 TCTCGCATCA ATATTTGATG CAAAAACTAC AAAATGTCCA TGAGACCTTA

901 ACTGCTAACG ACATTATCAA ATCGACACTT CTGCACTACT ATCAGCTCCA

951 AGAAAGTACT GTACGAGCTA TTTTCTTCAA AGAAGGGTTG TTCAGCAAAG

1001 AACAAGTGGC ATTCTCGACG CAACACCCCA GAGAGCTCTC AGAAATACAA

1051 CGGGTATACC ACTACTTACA TGCCTATGAA GAAGCAAAAT CTGCTTTTAT

1101 CCATGACACT CAAAATCCCT TACTGAAAGC CTGGGAGTAT ACTTTAGCGA

1151 CTCTTGCGGA TGCTAGCCAA CCTACCATCT CAAACCATAT CCGCCTTGCC

1201 TTAGGATGGA AAAGTGAAGA CCCTCACAGT CTTGTATCTC TAGTTACACA

1251 CTTTCTTGAA GAGGAAGTAG AAAACATCCG AATTTTAGTC CAACAATGTG

1301 AACAGACCTA TCACGAAGCA CGCTCCCAAC TAGAATATAT TGAAGGGCGG

1351 ATGCGCAACC CACTAAATAA TCAAGACAGT CAGATTTTGA CGATGGATCA

1401 CATGCGCTTC CGTCAAGAAC TCAATAAAGC TCTTTATGAG TGGGATAGTG

1451 CTCAAGAAAA GGCAAAGAAA TTTCTACATC TTCCTGAATT CTTACTTTCT

1501 TTCTATACAA AGCAAATTCC CTTATACTTT CGTAGTTCTT ACGATGCCTT

1551 CATTCAAGAA TTTGCTCATC TCTATGCTAA TGCTCCCGCT GGCTTCCGTA

1601 TTCTTTTCAC GCATGGACGC ACCCATCCGA ACACATGGTC CCCCATCTAT

1651 TCGATTAATG AATTTATACG TTTTCTTTCT GAATTCTTCA CCTCCACAGA

1701 GTCAGAACTT GTGGGGAAAC ATGCCGTGAT CAATTTAGAG AAAGAAACAT

1751 CTCGGCTCGT CCACAACATC ACTGCCATGC TACACACGGA TGTTTTCCAA

1801 GAAGCTCTCC TTACAAGAAT TTTAGAAGCC TATCAGCTTC CTGTGCCTCC

1851 CTCCATCTTA AACCACTTAG ATCAGCTGTC ACAAACTCCC TGGGTTTATG

1901 TTTCTGGAGG AACAGTGGAC ACTCTTCTTT TGGATTATTT TGAAAGCTCA

1951 GAACCTCTGA CACTTACAGA AAAGCATCCT GAAAATCCTC ATGAGCTTGC

2001 AGCTTTCTAC GCAGACGCCC TTAAAGATCT CCCTACAGGA ATTAAAAGTT

2051 ATCTAGAAGA AGGATCCCAC TCTCTACTTA GCTCATCACC CACCCACGTT

2101 TTCTCTATAA TCGCAGGATC TCCTTTATTT CGGGAAGCTT GGGATAATGA

2151 TTGGTACAGC TATACCTGGC TTCGTGATGT CTGGGTGAAA CAACACCAAG 

2201 ATTTCCTTCA AGATACTATA TTACCTCAGC TAAGTATCTA TGCTTTCATA 

2251 GAGAATTTTT GTAACAAATA TGCTTTGCAA CATGTAGTTC ATGACTTTCA

2301 TGATTTCTGC TCCGACCACT CCTTGACTCT TCCGGAGCTC TATGACAAAG 

2351 GATCGCGTTT TCTAAGCTCC TTATTCACCA AAGATAAGAC CGTAGCTCTT 

2401 ATCTATATAC GCCGTCTTCT CTACCTTATG GTCCGTGAAG TCCCTTATGT

2451 TTCAGAACAA CAGCTTCCAG AAGTCTTAGA TAACGTCTCT TCATATCTCG 

2501 GGATTTCCTC TCGTATTACC TATGAGAAAT TCCGCTCCCT GATAGAGGAA

2551 ACCATCCCTA AAATGACCTT ACTCTCCTCA GCAGACCTGA GGCATATCTA 

2601 TAAAGGTCTC CTCATGCAAA GTTATCAAAA GATCTACACC GAAGAAGATA 

2651 CGTACCTCCG CCTCACCACG GCAATGAGGC ATCATAATCT TGCCTATCCC 

2701 GCTCCTTTGC TCTTTGCAGA CAGTAACTGG CCTTCTATTT ATTTTGGATT 

2751 CATCCTAAAT CCAGGAACCA CAGAGATCGA TCTTTGGAAA TTTAACTATG

2801 CAGGGCTGCA AGGACAGCCT CTTGACAATA TCCAGGAGCT GTTCGCAACG 

2851 TCAAGACCCT GGACCCTCTA TGCAAATCCT ATAGATTATG GCATGCCACC 

2901 GCCTCCAGGC TACCGCAGCC GCCTCCCTAA AGAATTTTTC TAG 

The PSORT algorithm predicts a cytoplasmic location (0.206).

The protein was expressed in E.coli and purified as a GST-fusion (Figure 62A) or his-tagged
product. The proteins were used to immunise mice, whose sera were used in Western blot (Figure
62B) and FACS (Figure 62C) analyses.

This protein also showed good cross-reactivity with human sera, including sera from patients with 
pneumonitis. 

These experiments show that cp7101 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 63 

The following C.pneumoniae protein (pid 4377107) was expressed <SEQ ID 125; cp7107>:

1 MSIVRNSALP LPCLSRSETF KKVRSHMKFM KVLTPWIYRK DLWVTAFLLT

51 AIPGSFAHTL VDIAGEPRHA AQATGVSGDG KIVIGMKVPD DPFAITVGFQ

101 YIDGHLQPLE AVRPQCSVYP NGITPDGTVI VGTNYAIGMG SVAVKWVNGK

151 VSELPMLPDT LDSVASAVSA DGRVIGGNRN INLGASVAVK WEDDVITQLP

201 SLPDAMNACV NGISSDGSII VGTMVDVSWR NTAVQWIGDQ LSVIGTLGGT

251 TSVASAISTD GTVIVGGSEN ADSQTHAYAY KNGVMSDIGT LGGFYSLAHA

301 VSSDGSVIVG VSTNSEHRYH AFQYADGQMV DLGTLGGPES YAQGVSGDGK

351 VIVGRAQVPS GDWHAFLCPF QAPSPAPVHG GSTVVTSQNP RGMVDINATY

401 SSLKNSQQQL QRLLIQHSAK VESVSSGAPS FTSVKGAISK QSPAVQKDVQ

451 KGTFLSYRSQ VHGNVQNQQL LTGAFMDWKL ASAPKCGFKV ALHYGSQDAL

501 VERAALPYTE QGLGSSVLSG FGGQVQGRYD FNLGETVVLQ PFMGIQVLHL

551 SREGYSEKNV RFPVSYDSVA YSAATSFMGA HVFASLSPKM STAATLGVER

601 DLNSHIDEFK GSVSAMGNFV LENSTVSVLR PFASLAMYYD VRQQQLVTLS

651 VVMNQQPLTG TLSLVSQSSY NLSF*

The cp7107 nucleotide sequence <SEQ ID 126> is: 

1 ATGAGTATAG TCAGAAATTC TGCATTGCCA CTTCCGTGTT TAAGCAGATC 

51 CGAAACCTTT AAAAAAGTTA GGTCGCATAT GAAATTTATG AAAGTCCTTA

101 CTCCATGGAT TTATCGAAAA GATCTTTGGG TAACAGCATT CTTACTGACA 

151 GCAATTCCAG GATCTTTTGC ACATACTCTT GTTGATATAG CAGGAGAACC 

201 TCGGCATGCT GCTCAAGCAA CAGGAGTTTC TGGAGATGGT AAAATTGTTA 

251 TAGGAATGAA AGTTCCGGAT GATCCTTTTG CTATAACTGT AGGATTTCAA 

301 TATATTGATG GGCATTTGCA ACCCTTAGAG GCAGTACGTC CTCAATGCTC

351 TGTATACCCT AATGGTATAA CCCCGGACGG AACGGTTATT GTGGGTACAA 

401 ACTATGCCAT CGGGATGGGT AGTGTTGCTG TGAAATGGGT AAATGGCAAG 

451 GTTTCTGAAC TTCCCATGCT CCCTGACACC CTCGATTCTG TAGCATCGGC 

501 AGTTTCTGCA GATGGAAGAG TGATTGGAGG GAATAGAAAT ATAAATCTTG 

551 GCGCTTCTGT TGCTGTGAAA TGGGAGGACG ACGTGATTAC ACAACTTCCT

601 TCTCTTCCTG ATGCTATGAA TGCTTGTGTT AACGGAATTT CTTCAGATGG

651 TTCTATAATT GTAGGAACCA TGGTAGACGT GTCATGGAGA AATACCGCAG

701 TACAATGGAT CGGGGATCAG CTCTCTGTTA TTGGGACTTT AGGAGGAACT 

751 ACTTCTGTTG CTAGTGCAAT CTCAACAGAT GGCACTGTGA TTGTAGGAGG 

801 TTCTGAAAAT GCAGATTCTC AGACTCATGC CTATGCTTAT AAAAACGGTG

851 TTATGAGCGA TATAGGGACC CTCGGAGGTT TTTATTCTTT AGCACATGCA 

901 GTATCTTCAG ATGGTTCTGT GATTGTAGGA GTATCCACGA ACTCTGAGCA 

951 TAGATATCAT GCATTCCAAT ATGCTGATGG ACAGATGGTA GATTTAGGAA 

1001 CTTTAGGAGG GCCTGAATCT TATGCTCAAG GTGTGTCTGG AGATGGAAAG 

1051 GTAATTGTGG GTAGAGCACA AGTACCATCT GGAGATTGGC ATGCGTTCCT

1101 ATGTCCTTTC CAAGCTCCGA GCCCTGCTCC TGTCCATGGG GGAAGCACTG 

1151 TCGTAACTAG CCAGAATCCA CGTGGAATGG TAGATATCAA TGCTACGTAC 

1201 TCCTCTTTGA AAAATAGCCA ACAACAACTA CAAAGATTGC TTATCCAGCA 

1251 TAGTGCAAAA GTTGAAAGTG TATCCTCAGG AGCACCATCT TTTACAAGTG 

1301 TGAAAGGTGC GATCTCAAAA CAGAGCCCTG CAGTGCAAAA TGATGTACAG 

1351 AAAGGGACGT TTTTAAGTTA CCGTTCCCAA GTTCATGGAA ACGTGCAGAA 

1401 TCAGCAATTG CTCACAGGAG CTTTTATGGA CTGGAAACTC GCTTCAGCTC 

1451 CTAAATGCGG CTTTAAAGTA GCTCTCCACT ATGGCTCTCA AGATGCTCTC 

1501 GTAGAACGTG CAGCTCTTCC TTACACAGAA CAAGGCTTAG GAAGCAGTGT 

1551 CTTGTCAGGT TTTGGAGGAC AAGTTCAAGG ACGCTATGAC TTTAATTTAG

1601 GAGAAACTGT TGTTCTGCAA CCCTTTATGG GCATTCAAGT TCTCCACCTA 

1651 AGTAGAGAAG GGTATTCTGA GAAGAATGTT CGATTTCCTG TAAGCTATGA 

1701 TTCTGTAGCC TACTCAGCAG CTACTAGCTT TATGGGTGCG CATGTATTTG 

1751 CCTCCCTAAG CCCTAAAATG AGTACAGCAG CAACTTTAGG TGTGGAGAGA 

1801 GATCTGAATT CACATATAGA TGAATTTAAG GGATCCGTCT CTGCTATGGG 

1851 AAACTTTGTC TTGGAAAATT CTACAGTGAG TGTTTTAAGA CCTTTTGCTT 

1901 CTCTTGCTAT GTACTATGAC GTAAGACAAC AGCAACTCGT GACGTTGTCA 

1951 GTAGTTATGA ATCAACAACC CTTAACAGGC ACACTAAGCT TAGTAAGCCA 

2001 AAGTAGCTAT AATCTOAGCT TCTAA 

The PSORT algorithm predicts an inner membrane location (0.100). 

The protein was expressed in E.coli and purified as a GST-fusion (Figure 63A) or his-tagged
product. The proteins were used to immunise mice, whose sera were used in Western blot (Figure 
63B) and FACS (Figure 63C) analyses. 

These experiments show that cp7107 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 64 



The following C.pneumoniae protein (pid 4376467) was expressed <SEQ ID 127; cp6467>: 

1 MLRFFAVFIS TLWLITSGCS PSQSSKGIFV VNMKEMPRSL DPGKTRLIAD

51 QTLMRHLYEG LVEEHSQNGE IKPALAESYT ISEDGTRYTF KIKNILWSNG

101 DPLTAQDFVS SWKEILKEDA SSVYLYAFLP IKNARAIFDD TESPKNLGVR

151 ALDKRHLEIQ LETPCAHFLH FLTLPIFFPV HETLRNYSTS FEEMPITCGA

201 FRPVSLEKGL RLHLEKNPMY HNKSRVKLHK IIVQFISNAN TAAILFKHKK

251 LDWQGPPWGE PIPPEISASL HQDDQLFSLP GASTTWLLFN IQKKPWNNAK

301 LRKALSLAID KDMLTKVVYQ GLAEPTDHIL HPRLYPGTYP ERKRQNERIL

351 EAQQLFEEAL DELQMTREDL EKETLTFSTF SFSYGRICQM LREQWKKVLK

401 FTIPIVGQEF FTIQKNFLEG NYSLTVNQWT AAFIDPMSYL MIFANPGGIS

451 PYHLQDSHFQ TLLIKITQEH KKHLRNQLII EALDYLEHCH ILEPLCHPNL

501 RIALNKNIKN FNLFVRRTSD FRFIEKL*

A predicted signal peptide is highlighted. 



The cp6467 nucleotide sequence <SEQ ID 128> is: 

1 ATGCTCCGTT TCTTCGCTGT ATTTATATCA ACTCTTTGGC TCATTACCTC

51 AGGATGTTCC CCATCCCAAT CCTCTAAAGG AATTTTTGTG GTAAATATGA 

101 AGGAAATGCC ACGCTCCTTG GATCCTGGAA AAACTCGTCT CATTGCAGAC 

151 CAAACTCTAA TGCGTCATOT ATATGAAGGA CTCGTCCAAG AACATTCCCA 

201 AAATGGAGAG ATTAAACCAG CCCTTGCAGA AAGCTACACC ATCTCCGAAG 

251 ACGGGACTCG GTACACATTT AAAATCAAAA ACATCCTTTG GAGTAACGGA 

301 GACCCTCTGA CAGCTCAAGA CTTTGTCTCC TCTTGGAAGG AAATCCTAAA

351 GGAAGATGCG TCCTCCGTAT ATCTCTATGC GTTTTTACCT ATCAAAAATG

401 CTCGGGCAAT CTTTGATGAT ACTGAGTCTC CAGAAAATCT AGGAGTCCGA 

451 GCTTTAGATA AGCGTCATCT CGAAATTCAG TTAGAAACTC CCTGCGCGCA 

501 TTTCCTACAT TTCTTGACTC TTCCTAOTTT TTTCCCTGTT CATGAAACTC 

551 TGCGAAACTA TAGCACCTCT TTTGAAGAGA TGCCCATTAC CTGCGGTGCT

601 TTCCGCCCTG TGTCTCTAGA AAAAGGCCTG AGACTCCATC TAGAGAAAAA 

651 CCCTATGTAC CATAATAAAA GCCGTGTGAA ACTACATAAA ATTATTGTAC 

701 AGTTTATCTC AAACGCTAAC ACTGCAGCCA TTCTATTCAA ACATAAGAAA 

751 TTAGATTGGC AAGGACCTCC TTGGGGAGAA CCTATCCCTC CAGAAATCTC 

801 AGCTTCTCTA CATCAAGATG ACCAGCTCTT TTCTCTTCCG GGCGCTTCGA

851 CTACATGGTT ACTCTTTAAT ATACAAAAAA AACCTTGGAA CAATGCTAAA 

901 TTACGCAAGG CATTGAGCCT TGCAATAGAC AAAGATATGT TAACCAAAGT 

951 GGTATACCAA GGTCTTGCAG AACCTACAGA TCATATCCTA CATCCAAGAC 

1001 TTTATCCAGG GACCTATCCC GAACGGAAAA GACAAAACGA AAGAATTCTT 

1051 GAGGCTCAAC AACTCTTTGA AGAAGCTCTA GACGAACTTC AAATGACACG

1101 CGAAGATCTA GAAAAGGAAA CTTTGACTTT CTCAACCTTT TCTTTTTCTT

1151 ACGGAAGGAT TTGCCAAATG CTAAGAGAAC AATGGAAGAA AGTCTTAAAA 

1201 TTTACTATCC CTATAGTAGG CCAAGAGTTT TTCACAATAC AAAAAAACTT 

1251 CCTAGAGGGG AACTATTCCC TAACCGTGAA CCAATGGACC GCAGCATTTA 

1301 TTGATCCGAT GTCTTATCTC ATGATCTTTG CCAATCCTGG AGGAATTTCC

1351 CCCTATCACC TCCAAGATTC ACACTTTCAA ACTCTTCTCA TAAAGATCAC 

1401 TCAAGAACAT AAAAAACACC TACGAAATCA GCTTATTATT GAAGCCCTTG 

1451 ACTATTTAGA ACACTGTCAC ATTCTCGAAC CACTATGTCA TCCAAATCTT

1501 CGAATTGCTT TGAACAAAAA CATTAAAAAC TTTAATCTTT TTGTTCGACG 

1551 AACTTCAGAC TTTCGTTTTA TAGAAAAACT ATAG

The PSORT algorithm predicts an outer membrane lipoprotein (0.790). 

The protein was expressed in E.coli and purified as a his-tag product and a GST-fusion protein, as
shown in Figure 64A. The recombinant his-tag protein was used to immunise mice, whose sera were
used in a Western blot (Figure 64B). The recombinant GST-fusion protein was also used to
immunise mice, whose sera were used in a Western blot (Figure 64C) and for FACS analysis (Figure
64D).

These experiments show that cp6467 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 65 

The following C.pneumoniae protein (pid 4376679) was expressed <SEQ ID 129; cp6679>:

1 MRKMLVLLAS LGLLSPTLSS CTHLGSSGSY HPKLYTSGSK TKGVIAMLPV

51 FHRPGKSLEP LPWNLQGEFT EEISKRFYAS EKVFLIKHNA SPQTVSQFYA

101 PIANRLPETI IEQFLPAEFI VATELLEQKT GKEAGVDSVT ASVRVRVFDI

151 RHHKIALIYQ EIIECSQPLT TLVNDYHRYG WNSKHFDSTP MGLMHSRLFR

201 EVVARVEGYV CANYS*

A predicted signal peptide is highlighted. 

The cp6679 nucleotide sequence <SEQ ID 130> is: 

1 ATGCGAAAAA TGTTGGTATT ATTGGCATCT TTAGGACTTC TATCCCCAAC 

51 CCTATCCAGC TGCACTCACT TAGGCTCTTC AGGAAGTTAT CATCCTAAGC 

101 TATACACTTC AGGGAGCAAA ACTAAAGGTG TGATTGCGAT GCTTCCTGTA

151 TTTCATCGCC CAGGAAAGAG TCTTGAACCT TTACCTTGGA ACCTCCAAGG 

201 AGAATTTACT GAAGAGATCA GCAAAAGGTT TTATGCTTCG GAAAAGGTCT

251 TCCTGATCAA GCACAATGCT TCACCTCAGA CAGTCTCTCA GTTCTATGCT 

301 CCGATTGCGA ATCGTCTACC CGAAACAATT ATTGAGCAAT TTCTTCCTGC 

351 AGAATTCATT GTTGCTACAG AACTGTTAGA ACAAAAGACA GGGAAAGAAG

401 CAGGTGTCGA TTCTGTAACA GCGTCTGTAC GTGTTCGCGT TTTTGATATC 

451 CGTCATCATA AAATAGCTCT CATTTATCAA GAGATTATCG AATGCAGCCA 

501 GCCTTTAACT ACCCTAGTCA ATGATTATCA TCGCTATGGC TGGAACTCAA 

551 AACATTTTGA TTCAACGCCC ATGGGCTTAA TGCATAGCCG TCTTTTCCGC

601 GAAGTTGTTG CCAGAGTTGA GGGCTATGTT TGTGCTAACT ACTCGTAG

The PSORT algorithm predicts an inner membrane location (0.149). 

The protein was expressed in E.coli and purified as a his-tag product (Figure 65A) and as a GST-fusion
product (Figure 65B). The recombinant protein was used to immunise mice, whose sera were
used in a Western blot (Figure 65C) and for FACS analysis.

These experiments show that cp6679 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
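
Each Example pairs a protein listing with the nucleotide sequence that encodes it, so the two can be cross-checked by translating the open reading frame. The following is a minimal sketch of such a check, not part of the described method: it assumes the standard genetic code applies, and the helper names `clean_listing` and `translate` are hypothetical. The input is the first 50-nucleotide row of the cp6679 listing.

```python
import re

BASES = "TCAG"
# Standard genetic code with codons enumerated in TCAG order.
# Assumption: this table is appropriate for the listed open reading frames.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean_listing(text):
    """Drop position labels and whitespace from a numbered sequence listing."""
    return re.sub(r"[\d\s]+", "", text).upper()


def translate(dna):
    """Translate complete codons only; unrecognised codons become 'X'."""
    dna = clean_listing(dna)
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


# First row of the cp6679 nucleotide listing (positions 1-50).
row = "1 ATGCGAAAAA TGTTGGTATT ATTGGCATCT TTAGGACTTC TATCCCCAAC"
print(translate(row))
```

Comparing the translated residues with the corresponding stretch of the protein listing gives a quick indication of whether a row has been transcribed faithfully.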

Example 66 

The following C.pneumoniae protein (pid 4376890) was expressed <SEQ ID 131; cp6890>:

1 MKQLLFCVCV FAMSCSAYAS PRRQDPSVMK ETFRHNYGII VSGQEWVKRG

51 SDGTITKVLK NGATLHEVYS GGLLHGEITL TFPHTTALDV VQIYDQGRLV

101 SRKTFFVNGL PSQEELFNED GTFVLTRWPD NNDSDTITKP YFIETTYQGH

151 VIEGSYTSFN GKYSSSIHNG EGVRSVFSSN NILLSEETFN EGVMVKYTTF

201 YPNRDPESIT HYQNGQPHGL RLTYLQGGIP NTIEEWRYGF QDGTTIVFKN

251 GCKTSEIAYV KGVKEGLELR YNEQEIVAEE VSWRNDFLHG ERKIYAGGIQ

301 KHEWYYRGRS VSKAKFERLN AAG*

A predicted signal peptide is highlighted. 

The cp6890 nucleotide sequence <SEQ ID 132> is: 

1 ATGAAACAAT TACTTTTCTG TGTTTGCGTA TTTGCTATGT CATGTTCTGC

51 TTACGCATCC CCACGACGAC AAGATCCTTC TGTTATGAAG GAAACATTCC

101 GAAATAATTA TGGCATTATT GTTTCCGGTC AAGAATGGGT AAAGCGTGGT 

151 TCTGACGGCA CCATCACCAA AGTACTCAAA AATGGAGCTA CCCTGCATGA 

201 AGTTTATTCT GGAGGCCTCC TTCATGGGGA AATTACCTTA ACGTTTCCCC 

251 ATACCACAGC ATTGGACGTT GTTCAAATCT ATGATCAAGG TAGACTCGTT 

301 TCTCGCAAAA CCTTTTTTGT GAACGGTCTT CCATCTCAAG AAGAGCTGTT

351 CAATGAAGAT GGCACGTTTG TCCTCACACG ATGGCCGGAC AACAACGACA 

401 GTGATACCAT CACAAAGCCT TACTTCATAG AAACGACATA TCAAGGGCAT 

451 GTCATAGAAG GAAGTTATAC TTCCTTTAAT GGGAAATACT CCTCATCCAT 

501 CCACAATGGA GAGGGAGTTC GTTCTGTGTT CTCCTCCAAT AACATCCTTC 

551 TTTCTGAAGA GACCTTCAAT GAAGGTGTCA TGGTGAAATA TACCACATTC

601 TATCCGAATC GCGATCCCGA ATCGATTACT CATTATCAAA ATGGACAGCC 

651 TCACGGCTTA CGGCTAACAT ATCTACAAGG TGGCATCCCC AATACGATAG 

701 AGGAGTGGCG TTATGGCTTT CAAGACGGAA CGACCATCGT ATTTAAAAAT 

751 GGTTGTAAGA CATCTGAGAT CGCTTATGTT AAGGGAGTGA AAGAAGGTTT 

801 AGAACTGCGC TACAATGAAC AGGAAATTGT AGCTGAAGAA GTTTCTTGGC 

851 GTAATGATTT TCTGCATGGA GAACGTAAGA TCTATGCTGG AGGAATCCAA 

901 AAGCATGAAT GGTATTACCG CGGGAGATCT GTATCTAAAG CCAAATTCGA 

951 GCGGCTAAAT GCTGCAGGAT AG 

The PSORT algorithm predicts an outer membrane location (0.940). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 66A.
The recombinant protein was used to immunise mice, whose sera were used in a Western blot
(Figure 66B) and for FACS analysis. A his-tagged protein was also expressed.

These experiments show that cp6890 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
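
The sequence listings follow a fixed layout: each row begins with the position of its first residue, followed by blocks of ten residues. A row can therefore be checked mechanically for a wrong position label or for characters outside the expected alphabet. The sketch below illustrates this under stated assumptions: the function name `check_rows` and the short example rows are hypothetical, and the second row of `damaged` contains a deliberately introduced letter O.

```python
DNA_LETTERS = set("ACGT")
PROTEIN_LETTERS = set("ACDEFGHIKLMNPQRSTVWY*")


def check_rows(rows, alphabet):
    """Return (label, message) pairs for rows that break the listing layout."""
    problems = []
    expected = 1  # position label the next row should carry
    for row in rows:
        label, *blocks = row.split()
        if not label.isdigit() or int(label) != expected:
            problems.append((label, f"expected position label {expected}"))
        residues = "".join(blocks)
        bad = sorted(set(residues) - alphabet)
        if bad:
            problems.append((label, "unexpected characters: " + "".join(bad)))
        expected += len(residues)
    return problems


# Hypothetical two-row excerpts with 20 residues per row.
intact = ["1 ATGCGAAAAA TGTTGGTATT", "21 ATTGGCATCT TTAGGACTTC"]
damaged = ["1 ATGCGAAAAA TGTTGGTATT", "21 ATTGGCATOT TTAGGACTTC"]

print(check_rows(intact, DNA_LETTERS))
print(check_rows(damaged, DNA_LETTERS))
```

The same function can be applied to a protein listing by passing `PROTEIN_LETTERS` as the alphabet.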

Example 67

The following C.pneumoniae protein (pid 6172323) was expressed <SEQ ID 133; cp0018>:

1 MKTSVSMLLA LLCSGASSIV LHAATTPLNP EDGFIGEGNT NTFSPKSTTD

51 AAGTTYSLTG EVLYIDPGKG GSITGTCFVE TAGDLTFLGN GNTLKFLSVD

101 AGANIAVAHV QGSKNLSFTD FLSLVITESP KSAVTTGKGS LVSLGAVQLQ

151 DINTLVLTSN ASVEDGGVIK GNSCLIQGIK NSAIFGQNTS SKKGGAISTT

201 QGLTIENNLG TLKFNENKAV TSGGALDLGA ASTFTANHEL IFSQNKTSGN

251 AANGGAINCS GDLTFTDNTS LLLQENSTMQ DGGALCSTGT ISITGSDSIN

301 VIGNTSGQKG GAISAASLKI LGGQGGALFS NNVVTHATPL GGAIFINTGG

351 SLQLFTQGGD IVFEGNQVTT TAPNATTKRN VIHLESTAKW TGLAASQGNA

401 IYFYDPITTN DTGASDNLRI NEVSANQKLS GSIVFSGERL STAEAIAENL

451 TSRINQPVTL VEGSLVLKQG VTLITQGFSQ EPESTLLLDL GTSL*

A predicted signal peptide is highlighted. 

The cp0018 nucleotide sequence <SEQ ID 134> is: 

1 ATGAAGACTT CAGTTTCTAT GTTGTTGGCC CTGCTTTGCT CGGGGGCTAG 

51 CTCTATTGTA CTCCATGCCG CAACCACTCC ACTAAATCCT GAAGATGGGT 

101 TTATTGGGGA GGGCAATACA AATACTTTTT CTCCGAAATC TACAACGGAT

151 GCTGCAGGAA CTACCTACTC TCTCACAGGA GAGGTTCTGT ATATAGATCC 

201 GGGGAAAGGT GGTTCAATTA CAGGAACTTG CTTTGTAGAA ACTGCTGGCG 

251 ATCTTACATT TTTAGGTAAT GGAAATACCC TAAAGTTCCT GTCGGTAGAT 

301 GCAGGTGCTA ATATCGCGGT TGCTCATGTA CAAGGAAGTA AGAATTTAAG 

351 CTTCACAGAT TTCCTTTCTC TGGTGATCAC AGAATCTCCA AAATCCGCTG

401 TTACTACAGG AAAAGGTAGC CTAGTCAGTT TAGGTGCAGT CCAACTGCAA 

451 GATATAAACA CTCTAGTTCT TACAAGCAAT GCCTCTGTCG AAGATGGTGG 

501 CGTGATTAAA GGAAACTCCT GCTTGATTCA GGGAATCAAA AATAGTGCGA 

551 TTTTTGGACA AAATACATCT TCGAAAAAAG GAGGGGCGAT CTCCACGACT 

601 CAAGGACTTA CCATAGAGAA TAACTTAGGG ACGCTAAAGT TCAATGAAAA

651 CAAAGCAGTG ACCTCAGGAG GCGCCTTAGA TTTAGGAGCC GCGTCTACAT

701 TCACTGCGAA CCATGAGTTG ATATTTTCAC AAAATAAGAC TTCTGGGAAT 

751 GCTGCAAATG GCGGAGCCAT AAATTGCTCA GGGGACCTTA CATTTACTGA 

801 TAACACTTCT TTGTTACTTC AAGAAAATAG CACAATGCAG GATGGTGGAG 

851 CTTTGTGTAG CACAGGAACC ATAAGCATTA CCGGTAGTGA TTCTATCAAT

901 GTGATAGGAA ATACTTCAGG ACAAAAAGGA GGAGCGATTT CTGCAGCTTC

951 TCTCAAGATT TTGGGAGGGC AGGGAGGCGC TCTCTTTTCT AATAACGTAG 

1001 TGACTCATGC CACCCCTCTA GGAGGTGCCA TTTTTATCAA CACAGGAGGA 

1051 TCCTTGCAGC TCTTCACTCA AGGAGGGGAT ATCGTATTCG AGGGGAATCA 

1101 GGTCACTACA ACAGCTCCAA ATGCTACCAC TAAGAGAAAT GTAATTCACC

1151 TCGAGAGCAC CGCGAAGTGG ACGGGACTTG CTGCAAGTCA AGGTAACGCT 

1201 ATCTATTTCT ATGATCCCAT TACCACCAAC GATACGGGAG CAAGCGATAA 

1251 CTTACGTATC AATGAGGTCA GTGCAAATCA AAAGCTCTCG GGATCTATAG 

1301 TATTTTCTGG AGAGAGATTG TCGACAGCAG AAGCTATAGC TGAAAATCTT 

1351 ACTTCGAGGA TCAACCAGCC TGTCACTTTA GTAGAGGGGA GCTTAGTACT

1401 TAAACAGGGA GTGACCTTGA TCACACAAGG ATTCTCGCAG GAGCCAGAAT 

1451 CCACGCTTCT TTTGGATCTG GGGACCTCAT TATAA 

The PSORT algorithm predicts an outer membrane location (0.935).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 67A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
67B) and for FACS analysis.

These experiments show that cp0018 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 68 

The following C.pneumoniae protein (pid 4376262) was expressed <SEQ ID 135; cp6262>:

1 MRKLRILAIV LIALSIILIA GGVVLLTVAI PGLSSVISSP AGMGACALGC

51 VMLALGIDVL LKKREVPIVL ASVTTTPGTG SPRSGISISG ADSTIRSLPT

101 YLLDEGHPQS MRKLRILAIV LIVFSIILIA SGVVLLTVAI PGLSSVISSP

151 AGMGACALGC VMLALGIDVL LKKREVPIVL ASVTTTPGTG SPRSGISISG

201 ADSTIRSLPT YPLDEGHPQS MRKLRILAIV LIVFSIILIA SGVVLLTVAI

251 PGLSSIISSP AEMGACALGC VMLALGIDVL LKKREVPIVV PAPIPEEVVI

301 DDIDEESIRL QQEABAALAR LPEEMSAFEG YIKVVESHLE NMKSLPYDGH

351 GLEEKTKHQI RVVRSSLKAM VPEFLDIRRI FEJSEEFFFLS ARKRLIDLAT

401 TLVERKILTE QLERNNLtRKA FSYLYQDSIF KKIIDNFEKL AWKFMILSKS

451 ICRFTIIFEN HEHGVAKSLL HKNAVLLEKV IYRSLQKSYR DIGMSSAKMK

501 ILHGNPFFSL EDNKKTIMKE HAEMLESLSS YRKVFLALSD ENVVDTPSDP

551 KKWDLSGIPC RDALSEISRD EQWQKKAHLK HQESLYTQAR DRLTDQSSKE

601 NQKELEKAEQ EYISSWERVK KFEIERVQER IRAIQKLYPN ILEREEETTG

651 QETVTPTVQG TTASSDLTDI LGRIEVSSRE DNQNQESCVK VLRSHEVEMS

701 WEVKQEYGPK KKEFQDQMGS LERFFTEHIE ELEVLQKDYS KHLSYFKKVN

751 NKKEVQYAKF RLKVLESDLE GILAQTESAE SLLTQEELPI LATRGALEKA

801 VFKGSLCCAL ASKAKPYFEE DPRFQDSDTQ LRALTLRLQE AKASLEEEIK

851 RFSNLENDIA EERRLLKESK QTFERAGLGV LREXAVESTY DLRSLTNTWE

901 GTPESEKVYF SMYLNYYNEE KRRAKTRLVE MTQRYRDFKM ALEAMQFNEE

951 ALLQEELSIQ APSE*

A predicted signal peptide is highlighted.



The cp6262 nucleotide sequence <SEQ ID 136> is: 






1

51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 
651 
701 
751 
801 
851 
901 
951 
1001 
1051 
1101 
1151 
1201 
1251 
1301 
1351 
1401 
1451 
1501 
1551 
1601 
1651 
1701 
1751 
1801 
1851 
1901 
1951 
2001 
2051 
2101 
2151 
2201 
2251 
2301 
2351 
2401 



ATGAGGAAAC 
TTTGATTGCA 
GTTCAGTCAT 
GTGATGCTTG 
TATAGTTCTC 
GTGGTATTTC 
TATCTCTTGG 
TGCGATCGTT 
TATTGCTTAC 
GCAGGGATGG 
CGATGTTCTT 
CTACGACACC 
GCTGATAGCA 
TCCACAATCC 
TTAGCATTAT 
CCTGGATTAA 
TTTGGGATGT 
GAGAAGTCCC 
GATGATATAG 
TTTAGCAAGA 
TTGTCGAGAG 
GGGCTAGAAG 
GAAGGCTATG 
AAGAGTTCTT 
ACTTTAGTAG 
AAGGAAAGCG 
TTGATAACTT 
ATTTGTCGAT 
GAGCCTGTTA 
GTTTGCAAAA 
ATCTTGCACG 
AATGAAAGAA 
TATTTTTAGC 
AAGAAATGGG 
TTCTCGTGAT 
CCCTCTATAC 
AATCAGAAAG 
ACGGGTTAAA 
TTCAAAAGCT 
CAGGAGACTG 
AACAGATATT 
ATCAAGAGTC 
TGGGAAGTCA 
AATGGGTTCT 
TATTACAGAA 
AATAAGAAAG 
AGATTTAGAA 
CTCAAGAAGA 
GTTTTCAAAG 



TTCGTATTCT 
GGTGGTGTGG 
TTCTTCCCCG 
CTTTAGGGAT 
GCATCTGTAA 
TATTTCAGGA 
ACGAGGGACA 
CTCATAGTTT 
TGTAGCGATC 
GTGCCTGTGC 
CTGAAGAAAC 
AGGAACTGGC 
CCATACGTTC 
ATGAGGAAAC 
TTTGATTGCA 
GCTCGATCAT 
GTGATGCTTG 
TATAGTAGTT 
ATGAAGAGAG 
CTTCCTGAGG 
TCATTTGGAG 
AGAAAACGAA 
GTTCCAGAAT 
TTTTCTCTCA 
AGAGAAAAAT 
TTTTCTTATT 
CGAGAAGTTA 
TTACAATTAT 
CACAAGAATG 
AAGCTATAGA 
GCAACCCTTT 
CACGCAGAGA 
TCTATCTGAT 
ATTTGTCAGG 
GAACAGTGGC 
GCAAGCTAGG 
AGTTAGAGAA 
AAATTTGAGA 
TTATCCTAAT 
TGACTCCAAC 
TTAGGAAGAA 
TTGTGTAAAA 
AACAAGAGTA 
TTAGAGAGGT 
GGACTACTCT 
AGGTTCAATA 
GGGATTCTAG 
ACTTCCGATT 
GGAGTCTATG 



TGCGATCGTT 
TATTGCTTAC 
GCAGGGATGG 
CGATGTTCTT 
CTACGACACC 
GCTGATAGCA 
TCCACAATCC 
TTAGCATTAT 
CCTGGATTAA 
TTTGGGATGT 
GAGAAGTCCC 
AGCCCTAGAA 
TCTTCCTACG 
TTCGTATTCT 
AGTGGTGTGG 
TTCTTCCCCA 
CTTTGGGGAT 
CCCGCACCTA 
TATACGGCTG 
AGATGAGTGC 
AACATGAAAA 
ACATCAGATA 
TTTTAGATAT 
GCTCGCAAAC 
TTTAACAGAG 
TATATCAGGA 
GCATGGAAAT 
TTTTGAAAAT 
CAGTGTTACT 
GATATAGGCA 
TTTCTCTTTG 
TGCTTGAAAG 
GAGAACGTTG 
AATCCCCTGT 
AGAAGAAAGC 
GATCGTTTAA 
AGCTGAACAA 
TTGAGAGAGT 
ATC CTCGAGA 
TGTTCAAGGG 
TAGAGGTCTC 
GTCTTAAGAA 
TGGCCCTAAG 
TTTTTACAGA 
AAACACTTGT 
TGCGAAGTTT 
CTCAGACTGA 
CTTGCAACTC 
TTGCGCGCTA 



CTCATAGCTT 
TGTAGCGATC 
GTGCCTGTGC 
CTGAAGAAAC 
AGGAACTGGC 
CCATACGTTC 
ATGAGGAAAC 
TTTGATTGCA 
GTTCAGTCAT 
GTGATGCTTG 
TATAGTTCTC 
GTGGTATTTC 
TATCCCTTGG 
TGCGATCGTT 
TATTGCTTAC 
GCGGAGATGG 
CGACGTTCTT 
TTC CTGAAGA 
CAGCAGGAAG 
ATTTGAAGGT 
GCCTGCCTTA 
AGAGTCGTCA 
CAGAAGAATT 
GACTTATAGA 
CAACTTGAGC 
C TCAATTTTT 
TTATGATTTT 
CATGAACATG 
GGAGAAGGTA 
TGTCATCTGC 
GAAGATAATA 
TCTCAGTAGC 
TAGATACACC 
AGGGACGCGT 
ACATCTAAAG 
CAGACCAGAG 
GAGTACATAT 
ACAGGAGAGG 
GAGAAGAAGA 
ACGACGGCTT 
CAGTAGGGAG 
GTCATGAGGT 
AAAAAAGAAT 
GCATATTGAA 
CTTATTTTAA 
AGGTTGAAGG 
GAGTGCTGAG 
GGGGAGCCTT 
GCAAGCAAAG 



TGAGCAT TAT 
CCTGGATTAA 
TTTGGGATGT 
GAGAAGTCCC 
AGCCCTAGAA 
TCTTCCTACG 
TTCGTATTCT 
AGTGGTGTGG 
TTCTTCCCCG 
CTTTAGGGAT 
GCATCTGTAA 
TATTTCAGGA 
ACGAGGGACA 
CTCATAGTTT 
TGTAGCGATC 
GTGCTTGTGC 
CTGAAGAAAC 
AGTCGTCATA 
CTGAAGCCGC 
TACATAAAAG 
TGATGGTCAT 
GATCTTCTTT 
TTTGAAGAAG 
TTTAGCTACT 
GCAATAATTT 
AAAAAAATTA 
GAGTAAATCA 
GTGTAGCAAA 
ATCTATAGGA 
AAAGATGAAA 
AAAAGACGAT 
TATAGGAAGG 
TAGCGATCCA 
TGTCTGAGAT 
CATCAAGAGT 
CTCTAAAGAA 
CTTCTTGGGA 
ATACGGGCAA 
AACCACAGGT 
CATCCGATTT 
GATAATCAGA 
AGAAATGAGC 
TTCAGGATCA 
GAGTTAGAAG 
AAAAGTAAAC 
TTTTAGAGTC 
AGTCTGTTAA 
AGAGAAAGCT 
CAAAACCCTA 
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2451 
2501 
2551 
2601 
2651 
2701 
2751 
2801 
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TTTTGAAGAG GATCCCAGAT TCCAAGATTC TGATACGCAA TTGCGAGCTC 
TGACTCTAAG GTTACAGGAG GCTAAGGCAA GCCTGGAAGA AGAGATAAAG 
AGATTTTCAA ATCTTGAGAA CGATATTGCA GAGGAAAGAC GCCTTCTTAA 
AGAGAGCAAG CAGACGTTCG AAAGAGCAGG TTTAGGGGTT CTCCGAGAAA 
TTGCAGTCGA GTCTACTTAT GATTTGCGTT CCTTAACAAA TACATGGGAA 
GGGACCCCAG AGAGTGAGAA GGTCTATTTT AGCATGTATC TTAATTATTA 
CAACGAAGAG AAACGTAGGG CTAAAACAAG ATTGGTTGAA ATGACACAGA 
^ A ^ GA ^AAAATG GCCTTGGAAG CTATGCAGTT TAATGAAGAA 
2851 GCCCTTTTGC AAGAGGAACT CTCTATTCAA GCTCCCAGTG AATAA 

The PSORT algorithm predicts inner membrane (0.660).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 68A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
68B) and for FACS analysis. 

These experiments show that cp6262 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 69 

The following C.pneumoniae protein (pid 4376269) was expressed <SEQ ID 137; cp6269>:

1 MYQENLRLLE RLLYNSVQKS YADRLFSYEK TKMVHDTPLI PWEEDKEKCA

51 EAEKAFLEQQ KILLDYGKSI FWLNENDEIN LNDPWSWGLN TVRTRKVFQE

101 VDDSERWNHK VLIQKLEDDY EKLLEESSKE STEANKKLLS DLVDRLEDAK

151 TKFFLKKQEE VETRVKDLRA RYGGTVDPKQ DTEAKKKVEL EASLETFLDS

201 IESELVQCLE DQDIYWKEQD VKDLARTQEL EEQDIEAKRE EAAEDLRSLN

251 ERLKKSKTML DRAKWHIENA EDSITWWTSQ IEMKDMKARL KILKEDITSV

301 LPEIDEIETC LSLEELPLLT TRELLTKSYL KFKICSETLL KMTSVFENNI

351 YVQEYEVQLQ NLGFKLQGIS QRFGKKQDDF ANLEEQVALQ KKRLRELTQN

401 FEIQGFNFMK EDFKAAAKDL YIRSTAEQKM NFDVPCMELF RRYHEEVNKP

451 LLELMYNCAD SYRDAKKKLC SLRLDEKELL QKEIKKEEFY QKKQQRHADR

501 SRHTTYQKLR IAEELALELK KKI*

The cp6269 nucleotide sequence <SEQ ID 138> is: 

1 ATGTACCAGG AGAATCTAAG ATTGTTGGAA AGGCTTCTTT ATAATAGTGT
51 TCAAAAGAGC TATGCGGATC GGCTGTTTTC CTATGAAAAG ACAAAGATGG
101 TGCACGATAC TCCGCTGATT CCTTGGGAAG AGGATAAGGA AAAATGTGCT
151 GAAGCTGAGA AAGCTTTCTT AGAGCAACAG AAGATTCTCC TAGATTATGG
201 AAAATCTATC TTTTGGCTGA ATGAGAACGA TGAGATCAAT TTAAACGATC
251 CTTGGAGTTG GGGTCTTAAT ACGGTGAGGA CTAGGAAAGT ATTCCAAGAG
301 GTTGACGACA GTGAACGTTG GAATCATAAG GTACTCATTC AAAAACTCGA
351 GGACGATTAT GAGAAACTTC TAGAGGAAAG TTCAAAAGAG TCTACTGAAG
401 CAAATAAGAA GCTTTTATCT GACTTAGTAG ATCGTCTTGA AGATGCTAAG
451 ACAAAATTTT TCCTGAAGAA ACAGGAGGAG GTGGAGACTC GCGTTAAGGA
501 TCTTAGAGCT CGATATGGAG GCACAGTAGA TCCTAAGCAG GATACGGAAG
551 CTAAGAAGAA AGTCGAATTG GAGGCTAGCT TAGAAACCTT TTTAGATTCC
601 ATCGAATCAG AGCTAGTACA GTGTTTAGAA GATCAAGATA TATATTGGAA
651 AGAACAGGAT GTCAAAGATC TAGCACGTAC GCAAGAGCTC GAGGAACAAG
701 ATATTGAAGC GAAGAGGGAA GAAGCTGCCG AAGACCTAAG AAGTCTTAAT
751 GAGCGTTTAA AGAAGTCAAA AACTATGTTA GATAGGGCTA AATGGCATAT
801 ^^^ CT GAGGACAGTA TTACCTGGTG GACTAGTCAG ATAGAAATGA
851 AGGATATGAA AGCAAGACTG AAGATCTTAA AAGAAGATAT AACAAGTGTT
901 ^^r^ TAGATGAGAT TGAAACGTGT TTAAGCTTAG AGGAGCTTCC
951 !Z™ ACG ACCAGGGAAC TCTTAACTAA GTCCTACCTA AAGTTTAAGA
1001 TTTGTTCGGA AACACTATTA AAAATGACTT CTGTGTTTGA GAACAATATC
1051 ^S TTCAGG AGTACGAGGT TCAGCTGCAA AATCTAGGGT TTAAGTTACA
1101 AGGTATATCT CAGAGATTCG GAAAGAAACA AGACGATTTT GCGAATCTAG
1151 AGGAACAGGT TGCTTTGCAA AAGAAACGAC TCAGAGAGCT CACTCAGAAT
1201 TTTGAAATAC AAGGATTCAA TTTCATGAAA GAAGATTTA AGGCAGCCGC
1251 AAAGATCTT TATATAAGAA GTACAGCTGA ACAAAAGATG AACTTTGATG
1301 TGCCTTGCAT GGAGCTCTTC CGTAGGTATC ATGAGGAGGT CAACAAGCCG
1351 CTTCTTGAGT TGATGTACAA TTGTGCAGAC AGT T A T AG*G ATOCTaS
1401 AAAGCTTTGC TCTCTACGTC TTGATGAAAA AGAGTTATTA CAAAAAGAAA
1451 TCAAGAAAGA GGAATTTTAT CAAAAGAAAC AACAAAGGCA TGCAGATAGA
1501 TCACGTCATA CTACGTATCA AAAGCTACGA ATTGCTGAAG AGCTTGCTCT
1551 TGAGCTGAAG AAGAAAATCT AA

The PSORT algorithm predicts cytoplasmic location (0.412). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 69A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
69B) and for FACS analysis. 

These experiments show that cp6269 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 70 

The following C.pneumoniae protein (pid 4376270) was expressed <SEQ ID 139; cp6270>:

1 MKIFLRFLLI SLVPTLSMSN LLGAATTEEL SASNSFDGTT STTSFSSKTS
51 SATDGTNYVF
101 FSNIDATTAS
151 KGNLSLLDND
201 GGAIHTKNLT
251 IIFEGNTIGA
301 SVADALNINS
351 FKNGTWLKG
401 XDSLKNGKKI
451 ILELDAGKDI
501 SFNPTAEQEA
551 NVLHRSGREN
601 MNTNFAKTYA
651 YGQLSYGHTD
701 SGRGFFQEYT
751 LEKRFAEQYY
801 IVQASGFRSL



A predicted signal peptide is highlighted.

The cp6270 nucleotide sequence <SEQ ID 140> is: 

1 ATGAAGATTC CACTCCGCTT TTTATTGATA TCATTAGTAC CTACGCTTTC 

51 TATGTCGAAT TTATTAGGAG CTGCTACTAC CGAAGAGTTA TCGGCTAGCA

101 ATAGCTTCGA TGGAACTACA TCAACAACAA GCTTTTCTAG TAAAACATCA

151 TCGGCTACAG ATGGCACCAA TTATGTTTTT AAAGATTCTG TAGTTATAGA

201 AAATGTACCC AAAACAGGGG AAACTCAGTC TACTAGTTGT TTTAAAAATG 

251 AGGCTGCAGC TGGAGATCTA AATTTCTTAG GAGGGGGATT TTCTTTCACA 

301 TTTAGCAATA TCGATGCAAC CACGGCTTCT GGAGCTGCTA TTGGAAGTGA 

351 AGCAGCTAAT AAGACAGTCA CGTTATCAGG ATTTTCGGCA CTTTCTTTTC

401 TTAAATCCCC AGCAAGTACA GTGACTAATG GATTGGGAGC TATCAATGTT

451 AAAGGGAATT TAAGCCTATT GGATAATGAT AAGGTATTGA TTCAGGACAA 

501 TTTCTCAACA GGAGATGGCG GAGCAATTAA TTGTGCAGGC TCCTTGAAGA 

551 TCGCAAACAA TAAGTCCCTT TCTTTTATTG GAAATAGTTC TTCAACACGT 

601 GGCGGAGCGA TTCATACCAA AAACCTCACA CTATCTTCTG GTGGGGAAAC

651 TCTATTTCAG GGGAATACAG CGCCTACGGC TGCTGGTAAA GGAGGTGCTA

701 TCGCGATTGC AGACTCTGGC ACCCTATCCA TTTCTGGAGA CAGTGGCGAC 

751 ATTATCTTTG AAGGCAATAC GATAGGAGCT ACAGGAACCG TCTCTCATAG 

801 TGCTATTGAT TTAGGAACTA GCGCTAAGAT AACTGCGTTA CGTGCTGCGC

851 AAGGACATAC GATATACTTT TATGATCCGA TTACTGTAAC AGGATCGACA

901 TCTGTTGCTG ATGCTCTCAA TATTAATAGC CCTGATACTG GAGATAACAA

951 AGAGTATACG GGAACCATAG TCTTTTCTGG AGAGAAGCTC ACGGAGGCAG

1001 AAGCTAAAGA TGAGAAGAAC CGCACTTCTA AATTACTTCA AAATGTTGCT 

1051 TTTAAAAATG GGACTGTAGT TTTAAAAGGT GATGTCGTTT TAAGTGCGAA

1101 CGGTTTCTCT CAGGATGCAA ACTCTAAGTT GATTATQGAT TTAGGGACGT

1151 CGTTGGTTGC AAACACCGAA AGTATCGAGT TAACGAATTT GGAAATTAAT

1201 ATAGACTCTC TCAGGAACGG GAAAAAGATA AAACTCAGTG CTGCCACAGC

1251 TCAGAAAGAT ATTCGTATAG ATCGTCCTGT TGTACTGGCA ATTAGCGATG

1301 AGAGTTTTTA TCAAAATGGC TTTTTGAATG AGGACCATTC CTATGATGGG 

1351 ATTCTTGAGT TAGATGCTGG GAAAGACATC GTGATTTCTG CAGATOCTCG 

1401 CAGTATAGAT GCTGTACAAT CTCCGTATGG CTATCAGGGA AAGTGGACGA

1451 TCAATTGGTC TACTGATGAT AAGAAAGCTA CGGTTTCTTG GGCGAAGCAG

1501 AGTTTTAATC CCACTGCTGA GCAGGAGGCT CCGTTAGTTC CTAATCTTCT 

1551 TTGGGGTTCT TTTATAGATG TTCGTTCCTT CCAGAATTTT ATAGAGCTAG 

1601 GTACTGAAGG TGCTCCTTAC GAAAAGAGAT TTTGGGTTGC AGGCATTTCC 

1651 AATGTTTTGC ATAGGAGCGG TCGTGAAAAT CAAAGGAAAT TCCGTCATGT

1701 GAGTGGAGGT GCTGTAGTAG GTGCTAGCAC GAGGATGCCG GGTGGTGATA

1751 CCTTGTCTCT GGGTTTTGCT CAGCTCTTTG CGCGTGACAA AGACTACTOT 

1801 ATGAATACCA ATTTCGCAAA GACCTACGCA GGATCTTTAC GTTTGCAGCA 

1851 CGATGCTTCC CTATACTCTG TGGTGAGTAT CCTTTTAGGA GAGGGAGGAC 

1901 TCCGCGAGAT CCTGTTGCCT TATGTTTCCA AGACTCTGCC GTGCTCTTTC

1951 TATGGGCAGC TTAGCTACGG CCATACGGAT CATCGCATGA AGACCGAGTC

2001 TCTACCCCCC CCCCCCCCGA CGCTCTCGAC GGATCATACT TCTTGGGGAG 

2051 GATATGTCTG GGCTGGAGAG CTGGGAACTC GAGTTGCTGT TGAAAATACC 

2101 AGCGGCAGAG GATTTTTCCA AGAGTACACT CCATTTGTAA AAGTCCAAGC 

2151 TGTTTACGCT CGCCAAGATA GCTTTGTAGA ACTAGGAGCT ATCAGTCGTG

2201 ATTTTAGTGA TTCGCATCTT TATAACCTTG CGATTCCTCT TGGAATCAAG

2251 TTAGAGAAAC GGTTTGCAGA GCAATATTAT CATCTTGTAG CGATGTATTC 

2301 TCCAGATGTT TGTCGTAGTA ACCCCAAATG TACGACTACC CTACTTTCCA 

2351 ACCAAGGGAG TTGGAAGACC AAAGGTTCGA AOTTAGCAAG ACAGGCTGGT

2401 ATTGTTCAGG CCTCAGGTTT TCGATCTTTG GGAGCTGCAG CAGAGCTTTT

2451 CGGGAACTTT GGCTTTGAAT GGCGGGGATC TTCTCGTAGC TATAATGTAG

2501 ATGCGGGTAG CAAAATCAAA TTTTAG 

The PSORT algorithm predicts outer membrane (0.92). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 70A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot and for
FACS analysis (Figure 70B).

The cp6270 protein was also identified in the 2D-PAGE experiment (Cpn0013). 

These experiments show that cp6270 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 71 

The following C.pneumoniae protein (pid 4376402) was expressed <SEQ ID 141; cp6402>:

1 MPJVADLLSHL ETIiLSSKIFQ DYGPNGLQVG DPQTPVKKIA VAVTADLETI 

51 KQAVAAEANV LIVHHGIPWK GMPYPITGMI HKRIQLLIEH NIQLIAYHLP 

101 LDAHPTLGNN WRVALDLWWH DLKPFGSSLP YLGVQGSFSP IDIDSFIDLL 

151 SQYYQAPLKG SALGGPSRVS SAALISGGAY RELSSAATSQ VDCFITGNFD

201 EPAWSTALES NINFLAFGHT ATEKVGPKSL AEHLKSEFPI STTFIDTANP

The cp6402 nucleotide sequence <SEQ ID 142> is: 

1 ATGAATGTTG
51 AATATTTCAG
101 CTCCGGTAAA
151 AAACAAGCTG
201 TTTTTGGAAA
251 TCCAATTACT
301 TTGGATGCTC
351 AAATTGGCAT
401 TGCAAGGCTC
451 TCTCAATATT
501 TAGAGTCTCC
551 CTTCGGCAGC
601 GAACCTGCAT
651 TGGACATACA
701 TAAAAAGCGA ATTTCCTATT TCCACAACCT TTATAGATAC GGCCAACCCC
751 TTCTAA

The PSORT algorithm predicts cytoplasmic (0.158). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 71A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
71B) and for FACS analysis.

These experiments show that cp6402 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 72 

The following C.pneumoniae protein (pid 4376520) was expressed <SEQ ID 143; cp6520>: 

1 MKHYLSFSPS ADFFSKQGAI ETQVLFGERV LVKGSTCYAY SQLFHNELLW

51 KPYPGHSFRS TLVPCTPEFH IHPNVSVVSV DAFLDPWGIP LPFGTLLHVN

101 SQNTVIFPKD ILNHMNTIWG SGTPQCDPRH LRRLNYNFFA ELLIKDADLL

151 LNFPYVWGGR SVHESLEKPG VDCSGFINIL YQAQGYNVPR NAADQYADCH

201 WISSFENLPS GGLIFLYPKE EKRISHVMLK QDSSTLIHAS GGGKKVEYFI 

251 LEQDGKFLDS TYLFFRNNQR GRAFFGIPRK RKAFL* 

The cp6520 nucleotide sequence <SEQ ID 144> is: 

1 ATGAAACACT ACCTATCATT TTCTCCTTCT GCTGATTTTT TCTCTAAACA
51 GGGTGCTATT GAAACTCAAG TCCTTTTTGG AGAGCGCGTC TTAGTCAAAG
101 GGAGCACCTG CTATGCATAT TCCCAATTAT TCCACAATGA GCTGTTATGG
151 AAGCCCTATC CAGGTCATAG CTTTCGTTCT ACCCTAGTCC CCTGCACTCC
201 TGAATTTCAT ATCCATCCAA ATGTTTCTGT GGTTTCTGTG GATGCATTTT
251 TAGATCCTTG GGGGATCCCT CTTCCTTTTG GAACTTTACT CCATGTGAAT
301 TCTCAAAATA CCGTTATTTT CCCTAAGGAT ATTCTCAATC ATATGAACAC
351 CATCTGGGGC TCCGGCACAC CTCAATGCGA TCCTAGACAT CTACGTCGTC
401 TAAATTATAA CTTCTOTGCT GAACTTTTAA TTAAAGACGC AGACCTTTTA
451 CTGAACTTTC CCTATGTATG GGGAGGACGG TCTGTACACG AAAGTCTGGA
501 AAAGCCGGGT GTTGATTGTT CGGGATTTAT CAATATCCTT TACCAGGCAC
551 AGGGATACAA CGTCCCTAGA AACGCTGCAG ATCAATATGC GGATTGTCAT
601 TGGATCTCTA GCTTTGAGAA CCTTCCTTCT GGTGGGTTAA TATTTCTTTA
651 CCCTAAAGAA GAAAAGCGTA TTTCTCATGT TATGTTGAAA CAGGATAGTT
701 CCACCCTCAT TCATGCTTCT GGTGGAGGGA AAAAAGTGGA GTATTTCATT
751 TTAGAACAAG ATGGGAAGTT TTTAGATTCG ACTTATCTAT TTTTTAGAAA
801 TAATCAGAGG GGACGGGCAT TTTTTGGGAT CCCTAGAAAA AGAAAAGCCT
851 TTCTGTAA
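
As an illustrative consistency check between SEQ ID 144 and SEQ ID 143, the first row of the nucleotide listing can be translated and compared with the start of the protein listing. The following is a minimal sketch in Python, assuming the standard genetic code; the helper names `strip_layout` and `translate` are hypothetical and are not part of the original disclosure.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def strip_layout(row: str) -> str:
    """Drop the leading position label and the spaces between 10-base blocks."""
    return "".join(tok for tok in row.split() if not tok.isdigit())


def translate(nt: str) -> str:
    """Translate complete codons in frame 1; unreadable codons become 'X'."""
    codons = (nt[i:i + 3] for i in range(0, len(nt) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# First row of the cp6520 nucleotide listing, as printed above.
row_1 = "1 ATGAAACACT ACCTATCATT TTCTCCTTCT GCTGATTTTT TCTCTAAACA"

# Start of the cp6520 protein listing, with block spacing removed.
protein_start = "MKHYLSFSPSADFFSK"

translated = translate(strip_layout(row_1))
print(translated)                    # MKHYLSFSPSADFFSK
print(translated == protein_start)   # True
```

The same check can be applied row by row to any of the listings, bearing in mind that a row of 50 bases does not always end on a codon boundary, so rows should be concatenated before translation.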

The PSORT algorithm predicts cytoplasmic (0.265). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 72A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
72B) and for FACS analysis. 

These experiments show that cp6520 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 






Example 73 

The following C.pneumoniae protein (pid 4376567) was expressed <SEQ ID 145; cp6567>:

1 MTSPIPFQSS GDASFLAEQP QQLPSTSESQ LVTQLLTMMK HTQALSETVL

51 QQQRDRLPTA SIILQVGGAP TGGAGAPFQP GPADDHHHPI PPPVVPAQIE

101 TEITTIRSEL QLMRSTLQQS TKGARTGVLV VTAILMTISL LAIIIIILAV

151 LGFTGVLPQV ALLMQGETNL IWAMVSGSII CFIALIGTLG LILTNKNTPL

201 PAS*

The cp6567 nucleotide sequence <SEQ ID 146> is:

1 ATGACCTCAC CGATCCCCTT TCAGTCTAGT GGCGATGCCT CTTTCCTTGC

51 CGAGCAGCCA CAGCAACTCC CGTCTACTTC TGAATCTCAG CTAGTAACTC 

101 AATTGCTAAC CATGATGAAG CATACTCAAG CATTATCCGA AACGGTTCTT 

151 CAACAACAAC GCGATCGATT ACCAACCGCA TCTATTATCC TTCAAGTAGG 

201 AGGAGCTCCT ACAGGAGGAG CGGGTGCGCC TTTTCAACCA GGACCGGCAG 

251 ATGATCATCA TCATCCCATA CCGCCGCCTG TTGTACCAGC TCAAATAGAA 

301 ACAGAAATCA CCACTATAAG ATCCGAGTTA CAGCTCATGC GATCTACTCT 

351 ACAACAAAGC ACAAAAGGAG CTCGTACAGG AGTTCTAGTG GTTACTGCAA 

401 TCTTAATGAC GATCTCCTTA TTGGCTATTA TTATCATAAT ACTAGCTGTG 

451 CTTGGATTTA CGGGCGTCTT GCCTCAAGTA GCTTTATTGA TGCAGGGTGA 

501 AACAAATCTG ATTTGGGCTA TGGTGAGCGG TTCTATTATT TGCTTTATTG 

551 CGCTAATTGG AACTCTAGGA TTAATTTTAA CAAATAAGAA CACGCCTCTA 

601 CCGGCTTCOT AA

The PSORT algorithm predicts inner membrane (0.694). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 73A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
73B) and for FACS analysis. 

These experiments show that cp6567 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 74 

The following C.pneumoniae protein (pid 4376576) was expressed <SEQ ID 147; cp6576>:

1 MLIMRNKVIL QISILALIQT PLTLFSTEKV KEGHVVVDSI TIITEGENAS

51 NKHPLPKLKT RSGALFSQLD FDEDLRILAK EYDSVEPKVE FSEGKTNIAL

101 HLIAKPSIRN IHISGNQVVP EHKILKTLQI YRNDLFEREK FLKGLDDLRT

151 YYLKRGYFAS SVDYSLEHNQ EKGHIDVLIK INEGPCGKIK QLTFSQISRS

201 EKSDIQEFIQ TKQHSTTTSW FTGAGLYHPD IVEQDSLAIT NYLHNNGYAD

251 AIVNSHYDLD DKGNILLYMD IDRGSRYTLG HVHIQGFEVL PKRLIEKQSQ

301 VGPWDLYCPD KIWDGAHKIK QTYAKYGYXN TNVDVLFIPH ATRPIYDVTY

351 EVSEGSPYKV GLIKITGKTH TKSDVILHET SLFPGDTFNR LKLEDTEQRL

401 RNTGYFQSVS VYTVRSQLDP MGNADQYRDI FVEVKETTTG NLGLFLGFSS

451 LDHLFQGIEL SESNFDLFGA RNIFSKGFRC LRGGGEHLFL KANFGDKVTD

501 YTLKWTKPHF LNTPWILGIE LDKSINRALS KDYAVQTYGG NVSTTYILNE

551 HLKYGLFYRG SQTSLHEKRK FLLGPNIDSN KGFVSAAGVN LNYDSVDSPR

601 TPTTGIRGGV TFEVSGLGGT YHFTKLSLNS SIYRKLTRKG ILKIKGEAQF

651 IKPYSNTTAE GVPVSERFFL GGETTVRGYK SFIIGPKYSA TEPQGGLSSL

701 LISEEFQYPL IRQPNISAFV FLDSGFVGLQ EYKISLKDLR SSAGFGLRFD

751 VMNNVPVMLG FGWFFRPTET LNGEKIDVSQ RFFFALGGMF *

A predicted signal peptide is highlighted.



The cp6576 nucleotide sequence <SEQ ID 148> is: 

1 ATGCTCATCA TGCGAAATAA AGTTATCTTG CAAATATCTA TTCTAGCGTT
51 AATCCAAACC CCTTTAACTT TATTTTCTAC TGAAAAAGTT AAAGAAGGCC
101 ATGTGGTGGT AGACTCTATC ACAATCATAA CGGAAGGAGA AAATGCTTCA
151 AATAAACATC CCTTACCCAA ATTAAAGACC AGAAGTGGGG CTCTTTTTTC
201 TCAATTAGAT TTTGATGAAG ACTTGAGAAT TCTAGCTAAA GAATACGACT
251 CTGTTGAGCC TAAAGTAGAA TTTTCTGAAG GGAAAACTAA CATAGCCCTT
301 CACCTAATAG CTAAACCCTC AATTCGAAAT ATTCATATCT CAGGAAATCA
351 AGTCGTTCCT GAACATAAAA TTCTTAAAAC CCTACAAATT TACCGTAATG
401 ATCTCTTTGA ACGAGAAAAA TTTCTTAAGG GTCTTGATGA TCTAAGAACG
451 TATTATCTCA AGCGAGGATA TTTCGCATCC AGTGTAGACT ACAGTCTGGA
501 ACACAATCAA GAAAAAGGTC ACATCGATGT TTTAATTAAA ATCAATGAAG
551 GTCCTTGCGG GAAAATTAAA CAGCTTACGT TCTCAGGAAT CTCTCGATCA
601 GAAAAATCAG ATATCCAAGA ATTTATTCAA ACCAAGCAGC AGTCTACAAC
651 TACAAGTTGG TTTACTGGAG CTGGACTCTA TCACCCAGAT ATTGTTGAAC
701 AAGATAGCTT GGCAATTACG AATTACCTAC ATAATAACGG GTACGCTGAT
751 GCTATAGTCA ACTCTCACTA TGACCTTGAC GACAAAGGGA ATATTCTTCT
801 TTACATGGAT ATTGATCGAG GGTCGCGATA TACCTTAGGA CACGTCCATA
851 TCCAAGGGTT TGAGGTTTTG CCAAAACGCC TTATAGAAAA GCAATCCCAA
901 GTCGGCCCCA ATGATCTTTA TTGCCCCGAT AAAATATGGG ATGGGGCTCA
951 TAAGATCAAA CAAACTTATG CAAAGTATGG CTACATCAAT ACCAATGTAG
1001 ACGTTCTCTT CATCCCTCAC GCAACCCGCC CTATTTATGA TGTAACTTAT
1051 GAGGTAAGTG AAGGGTCTCC TTATAAAGTT GGGTTAATTA AAATTACTGG
1101 GAATACCCAT ACAAAATCTG ACGTTATTTT ACACGAAACC AGTCTCTTCC
1151 CAGGAGATAC ATTCAATCGC TTAAAGCTAG AAGATACTGA GCAACGTTTA
1201 AGAAATACAG GCTACTTCCA AAGCGTTAGT GTCTATACAG TTCGTTCTCA
1251 ACTTGATCCT ATGGGCAATG CGGATCAATA CCGAGATATT TTTGTAGAAG
1301 TCAAAGAAAC AACAACAGGA AACTTAGGCT TATTCTTAGG ATTTAGTTCT
1351 CTTGACAATC TTTTTGGAGG AATTGAACTA TCTGAAAGTA ATTTTGATCT
1401 AOTTGGAGCT AGAAATATAT TTTCTAAAGG TTTTCGTTGT CTAAGAGGCG
1451 GTGGAGAACA TCTATTCTTA AAAGCCAACT TCGGGGACAA AGTCACAGAC
1501 TATACTTTGA AGTGGACCAA ACCTCATTTT CTAAACACTC CTTGGATTTT
1551 AGGAATTGAA TTAGATAAAT CAATTAACAG AGCATTATCT AAAGATTATG
1601 CTGTCCAAAC CTATGGCGGG AACGTCAGCA CAACGTATAT CTTGAACGAA
1651 CACCTGAAAT ACGGTCTATT TTATCGAGGA AGTCAAACGA GTTTACATGA
1701 AAAACGTAAG TTCCTCCTAG GGCCAAATAT AGACAGCAAT AAAGGATTTG
1751 TCTCTGCTGC AGGTGTCAAC TTGAATTACG ATTCTGTAGA TAGTCCTAGA
1801 ACTCCAACTA CAGGGATTCG CGGGGGGGTG ACTTTTGAGG TTTCTGGTTT
1851 GGGAGGAACT TATCATTTTA CAAAACTCTC TTTAAACAGC TCTATCTATA
1901 GAAAACTTAC GCGTAAAGGT ATTTTGAAAA TCAAAGGGGA AGCTCAATTT
1951 ATTAAACCCT ATAGCAATAC TACAGCTGAA GGAGTTCCTG TCAGTGAGCG
2001 CTTCTTCCTA GGTGGAGAGA CTACAGTTCG GGGATATAAA TCCTTTATTA
2051 TCGGTCCAAA ATACTCTGCT ACAGAACCTC AGGGAGGACT CTCTTCGCTC
2101 CTTATTTCAG AAGAGTTTCA ATACCCTCTC ATCAGACAAC CTAATATTAG
2151 TGCCTTTGTA TTCTTAGACT CAGGTTTTGT CGGTTTACAA GAGTATAAGA
2201 TTTCGTTAAA AGATCTACGT AGTAGTGCTG GATTTGGTCT GCGCTTCGAT
2251 GTAATGAATA ATGTTCCTGT TATGTTAGGA TTTGGTTGGC CCTTCCGTCC
2301 AACCGAGACT TTGAATGGAG AAAAAATTGA TGTATCTCAG CGATTCTTCT
2351 TTCCTTTAGG GGGCATGTTC TAA



The PSORT algorithm predicts outer membrane (0.7658). 

The protein was expressed in E.coli and purified as GST-fusion (Figure 74A), his-tag and
his-tag/GST-fusion products. The recombinant proteins were used to immunise mice, whose sera were
used in a Western blot (Figure 74B) and for FACS analysis (Figure 74C).
The cp6576 protein was also identified in the 2D-PAGE experiment (Cpn0300). 

These experiments show that cp6576 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 75 

The following C.pneumoniae protein (pid 4376607) was expressed <SEQ ID 149; cp6607>:



1 MNKKQKDKLK ICVIISTLIL VGIFARAPRG DTFKTFLKSE EAIIYSNQCN
51 EDMRKILCDA IEHADEEIFL RIYNLSEPKI QQSLTRQAQA KNKVTIYYQK
101 FKIPQILKQA SNVTLVEQPP AGRKLMHQKA LSIDKKDAWL GSANYTNLSL
151 RLDNNLILGM HSSELCDLII TWTSGDFSIK DQTGKYFVLP QDRKIAIQAV
201 LEKIQTAQKT IQVAMFALTH SEIIQALHQA KQRGIHVDII IDRSHSKLTF
251 KQLKQLNINK DFVSINTAPC TLHHKFAVID NKTLLAGSIN WSKGRFSLWD
301 ESLIILENLT KQQNQKLRMX WKDLAKHSEH PTVDDEEKEI IEKSLPVEEQ
351 EAA*



A predicted signal peptide is highlighted. 

The cp6607 nucleotide sequence <SEQ ID 150> is:

1 ATGAATAAAA GACAAAAAGA TAAATTAAAA ATCTGTGTTA TTATTAGCAC

51 GTTGATTTTA GTAGGAATTT TTGCAAGAGC TCCTCGTGGT GACACTOTTA 

101 AGACTTTTTT AAAGTCTGAA GAAGCTATCA TCTACTCAAA TCAATGCAAT 

151 GAGGACATGC GTAAAATTCT ATGCGATGCT ATAGAACACG CTGATGAAGA 

201 GATCTTCCTA CGTATTTATA ACCTCTCAGA ACCCAAGATC CAACAGAGTT

251 TAACTCGACA AGCTCAAGCA AAAAACAAAG TTACGATCTA CTATCAAAAA 

301 TTTAAAATTC CCCAAATCTT AAAGCAAGCC AGCAATGTAA CTTTAGTCGA 

351 GCAACCTCCA GCAGGGCGTA AACTGATGCA TCAAAAAGCT CTTOCCATAG 

401 ATAAGAAAGA TGCTTGGCTA GGATCTGCGA ACTACACCAA TCTTTCTCTA

451 CGTTTAGATA ATAATCTCAT TCTAGGAATG CATAGCTCGG AGCTCTGTGA

501 TCTCATTATC ACAAATACCT CTGGAGACTT TTCTATAAAG GATCAAACAG 

551 GAAAGTATTT TGTTCTTCCT CAAGATCGTA AAATTGCAAT ACAAGCTGTA 

601 CTCGAAAAAA TCCAGACAGC TCAGAAAACC ATCCAAGTTG CTATGTTTGC 

651 TCTGACCCAC TCGGAGATTA TTCAAGCCTT ACATCAAGCA AAACAACGAG

701 GAATCCATGT AGATATTATC ATTGATAGAA GTCATAGCAA ACTTACTTTT

751 AAGCAATTAC GACAATTAAA TATCAATAAA GACTTTGTTT CTATAAATAC 

801 CGCACCCTGT ACTCTTCACC ATAAGTTTGC AGTTATAGAT AATAAAACTC 

851 TACTTGCAGG ATCTATAAAT TGGTCTAAAG GAAGATTCTC CTTAAATGAT 

901 GAAAGCTTGA TCATACTGGA AAACCTGACC AAACAACAAA ATCAGAAACT

951 TCGAATGATT TGGAAAGATC TAGCTAAGCA TTCAGAACAT CCTACAGTAG

1001 ACGATGAAGA AAAAGAAATT ATAGAAAAAA GTCTTCCAGT AGAAGAGCAA 

1051 GAAGCAGCGT GA 
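
The position labels in the two cp6607 listings allow a simple arithmetic cross-check: a coding sequence for a protein of n residues, followed by one stop codon, should contain 3 × (n + 1) bases. The following is a minimal sketch in Python; the helper name `last_position` is hypothetical, and the two rows passed to it are the final rows of the protein listing (SEQ ID 149) and of the nucleotide listing (SEQ ID 150).

```python
def last_position(row: str) -> int:
    """Return the position of the final symbol in a labelled listing row.

    The row is expected to look like '<start label> <block> <block> ...';
    a trailing '*' (stop marker in protein listings) is not counted.
    """
    label, *blocks = row.split()
    symbols = "".join(blocks).rstrip("*")
    return int(label) + len(symbols) - 1


protein_length = last_position("351 EAA*")          # final protein row
cds_length = last_position("1051 GAAGCAGCGT GA")    # final nucleotide row

print(protein_length)                         # 353
print(cds_length)                             # 1062
print(cds_length == 3 * (protein_length + 1)) # True
```

A mismatch in this check would point to a dropped or duplicated block in one of the two listings rather than to any property of the protein itself.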

The PSORT algorithm predicts periplasmic (0.934). 

The protein was expressed in E.coli and purified as a his-tagged product (Figure 75A) and also as a
GST-fusion. The GST-fusion protein was used to immunise mice, whose sera were used in a Western 
blot (Figure 75B) and for FACS analysis. 

These experiments show that cp6607 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 76 

The following C.pneumoniae protein (pid 4376624) was expressed <SEQ ID 151; cp6624>:

1 MDAKMGYIFK VMRWIFCFVA CGITPGCTNS GFQNANSRPC ILSMNRMIHD 
51 CVERWGNRL ATAVLIKGSL DPHAYEMVKG DKDKIAGSAV IFCNGLGU3H 
101 TLSLRKHLEN NPNSVKLGER L I ARG AFVPIj EEDGICDPHI WMDLSIWKEA 



151 VIEITEVLIE KFPEWSAEFK ANSEELVCEM SILDSWAKQC LSTIPENLRY

201 LVSGHHftPSY FTRRYLATPE EVASGAWRSR CISPEGLSPE AQISVRDIMA

251 WDYINEHDV SWFPEDTLN QDALKKXVSS LKK SHIiVRLA QKPLYSDNVD 

301 DNYFSTFKHN VCLITEELGG VAIiECQR* 

The cp6624 nucleotide sequence <SEQ ID 152> is: 



1 ATGGATGCGA
51 TTTCGTGGCA
101 ATGCAAATTC
151 TGTGTTGAAA
201 AGGATCCTTA
251 AGATTGCTGG
301 ACATTAAGTT
351 AGGGGAGCGG
401 GTATTTGCGA
451 GTCATAGAAA
501 TGAATTTAAA
551 ATTCTTGGGC
601 CTTGTCTCAG
651 TACTCCTGAA
701 CTGAGGGTCT
751 GTTGTAGATT
801 TACTCTGAAC
851 GTCATTTAGT
901 GACAATTATT
951 ATTAGGAGGG GTGGCTCTTG AATGTCAAAG ATGA

The PSORT algorithm predicts inner membrane (0.168). 

The protein was expressed in E.coli and purified as a his-tag product (Figure 76A). The recombinant
protein was used to immunise mice, whose sera were used in a Western blot (Figure 76B) and for 
FACS analysis. 

The cp6624 protein was also identified in the 2D-PAGE experiment. 

These experiments show that cp6624 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 77 

The following C.pneumoniae protein (pid 4376728) was expressed <SEQ ID 153; cp6728>: 



1 MKSSVSWLFF SSIPLFSSLS IVAAEVTLDS SMNSYDGSNG TTFTVFSTTD
51 AAAGTTYSLL SDVSFQNAGA LGIPLASGCF LEAGGDLTFQ GNQHALKFAF
101 INAGSSAGTV ASTSAADKNL LFNDFSRLSI ISCPSKLLSP TGQCALKSVG
151 NLSLTGNSQI IFTQNFSSDN GGVINTKNFL LSGTSQFASF SRNQAFTGKQ
201 GGVVYATGTI TIENSPGIVS FSQNLAKGSG GALYSTDNCS ITDNFQVIFD
251 GNSAWEAAQA QGGAICCTTT DKTVTLTGNK NLSFTNNTAL TYGGAISGLK
301 VSISAGGPTL FQSNISGSSA GQGGGGAINI ASAGELALSA TSGDITFNNN
351 QVTNGSTSTR NAINIIDTAK VTSIRAATGQ SIYFYDPITN PGTAASTDTL
401 NLNLADANSE IEYGGAIVFS GEKLSPTEKA IAANVTSTIR QPAVLARGDL
451 VLRDGVTVTF KDLTQSPGSR ILMDGGTTLS AKEAWLSLNG LAVNLSSLDG
501 TNKAALKTEA ADKNISLSGT IALIDTEGSF YENHNLKSAS TYPLLELTTA
551 GANGTITLGA LSTLTLQEPE THYGYQGNWQ LSWANATSSK IGSINWTRTG
601 YIPSPERKSN LPLNSLWGNF IDIRSINQLI ETKSSGEPFE RELWLSGIAN
651 FFYRDSMPTR HGFRHISGGY ALGITATTPA EDQLTFAFCQ LFARDRNHIT
701 GKNHGDTYGA SLYFHHTEGL FDIANFLWGK ATRAPWVLSE ISQIIPLSFD
751 AKFSYLHTDN HMKTYYTDWS IIKGSWRNDA FCADLGASLP FVISVPYLLK
801 EVEPFVKVQY IYAHQQDFYE RHAEGRAFNK SELIWVEIPI GVTFERDSKS
851 EKGTYDLTLM YILDAYRRNP KCQTSLIASD ANWMAYGTNL ARQGFSVRAA
901 NHFQVWPHME IFGQFAFEVR SSSRNYNTNL GSKFCF*

The cp6728 nucleotide sequence <SEQ ID 154> is:

1 ATGAAGTCCT CTGTCTCTTG GTTGTTCTTT TCTTCAATCC CGCTCTTTTC 

51 ATCGCTCTCT ATAGTCGCGG CAGAGGTGAC CTTAGATAGC AGCAATAATA 

101 GCTATGATGG ATCTAACGGA ACTACCTTCA CGGTCTTTTC CACTACGGAC 

151 GCTGCTGCAG GAACTACCTA TTCCTTACTT TCCGACGTAT CCTTTCAAAA 

201 TGCAGGGGCT TTAGGAATTC CCTTAGCCTC AGGATGCTTC CTAGAAGCGG

251 GCGGCGATCT TACTTTCCAA GGAAATCAAC ATGCACTGAA GTTTGCATOT 

301 ATCAATGCGG GCTCTAGCGC TGGAACTGTA GCCAGTACCT CAGCAGCAGA 

351 TAAGAATCTT CTCTTTAATG ATTTTTCTAG ACTCTCTATT ATCTCTTGTC 

401 CCTCTCTTCT TCTCTCTCCT ACTGGACAAT GTGCTTTAAA ATCTGTGGGG 

451 AATCTATCTC TAACTGGCAA TTCCCAAATT ATATTTACTC AGAACTTCTC

501 GTCAGATAAC GGCGGTGTTA TCAATACGAA AAACTTCTTA TTATCAGGGA 

551 CATCTCAGTT TGCGAGCTTT TCGAGAAACC AAGCCTTCAC AGGGAAGCAA 

601 GGCGGTGTAG TTTACGCTAC AGGAACTATA ACTATCGAGA ACAGCCCTGG

651 GATAGTTTCC TTCTCTCAAA ACCTAGCGAA AGGATCTGGC GGTGCTCTGT

701 ACAGCACTGA CAACTGTTCG ATTACAGATA ACTTTCAAGT GATCTTTGAC

751 GGCAATAGTG CTTGGGAAGC CGCTCAAGCT CAGGGCGGGG CTATTTGTTG 

801 CACTACGACA GATAAAACAG TGACTCTTAC TGGGAACAAA AACCTCTCTT 

851 TCACAAATAA TACAGCATTG ACATATGGCG GAGCCATCTC TGGACTCAAG 

901 GTCAGTATTT CCGCTGGAGG TCCTACTCTA TTTCAAAGTA ATATCTCAGG

951 AAGTAGCGCC GGTCAGGGAG GAGGAGGAGC GATCAATATA GCATCTGCTG

1001 GGGAACTCGC TCTCTCTGCT ACTTCTGGAG ATATTACCTT CAATAACAAC 

1051 CAAGTCACCA ACGGAAGCAC AAGTACAAGA AACGCAATAA ATATCATTGA 

1101 TACCGCTAAA GTCACATCGA TACGAGCTGC TACGGGGCAA TCTATCTATT 

1151 TCTATGATCC CATCACAAAT CCAGGAACCG CAGCTTCTAC CGACACATTG 

1201 AACTTAAACT TAGCAGATGC GAACAGTGAG ATCGAGTATG GGGGTGCGAT

1251 TGTCTTTTCT GGAGAAAAGC TTTCCCCTAC AGAAAAAGCA ATCGCTGCAA
1301 ACGTCACCTC TACTATCCGA CAACCTGCAG TATTAGCGCG GGGAGATCTT
1351 GTACTTCGTG ATGGAGTCAC CGTAACTTTC AAGGATCTGA CTCAAAGTCC 
1401 AGGATCCCGC ATCTTAATGG ATGGGGGGAC TACACTTAGT GCTAAAGAGG 
1451 CAAATCTTTC GCTTAATGGC TTAGCAGTAA ATCTCTCCTC TTTAGATGGA 
1501 ACCAACAAGG CAGCTTTAAA AACAGAAGCT GCAGATAAAA ATATCAGCCT 

1551 ATCGGGAACG ATTGCGCTTA TTGACACGGA AGGGTCATTC TATGAGAATC 

1601 ATAACTTAAA AAGTGCTAGT ACCTATCCTC TTCTTGAACT TACCACCGCA 

1651 GGAGCCAACG GAACGATTAC TCTGGGAGCT CTTTCTACCC TGACTCTTCA 

1701 AGAACCTGAA ACCCACTACG GGTATCAAGG AAACTGGCAG TTGTCTTGGG

1751 CAAATGCAAC ATCCTCAAAA ATAGGAAGCA TCAACTGGAC CCGTACAGGA

1801 TACATTCCTA GTCCTGAGAG AAAAAGTAAT CTCCCTCTAA ATAGCTTATG 

1851 GGGAAACTTT ATAGATATAC GCTCGATCAA TCAGCTTATA GAAACCAAGT 

1901 CCAGTGGGGA GCCTTTTGAG CGTGAGCTAT GGCTTTCAGG AATTGCGAAT 

1951 TTCTTCTATA GAGATTCTAT GCCCACCCGC CATGGTTTCC GCCATATCAG

2001 CGGGGGTTAT GCACTAGGGA TCACAGCAAC AACTCCTGCC GAGGATCAGC

2051 TTACTTTTGC CTTCTGCCAG CTCTTTGCTA GAGATCGCAA TCATATTACA 

2101 GGTAAGAACC ACGGAGATAC TTACGGTGCC TCTTTGTATT TCCACCATAC 

2151 AGAAGGGCTC TTCGACATCG CCAATTTCCT CTGGGGAAAA GCAACCCGAG 

2201 CTCCCTGGGT GCTCTCTGAG ATCTCCCAGA TCATTCCTTT ATCGTTCGAT

2251 GCTAAATTCA GTTATCTCCA TACAGACAAC CACATGAAGA CATATTATAC

2301 CGATAACTCT ATCATCAAGG GTTCTTGGAG AAACGATGCC TTCTGTGCAG

2351 ATCTTGGAGC TAGCCTGCCT TTTGTTATTT CCGTTCCGTA TCTTCTGAAA

2401 GAAGTCGAAC CTTTTGTCAA AGTACAGTAT ATCTATGCGC ATCAGCAAGA
2451 CTTCTACGAG CGTCATGCTG AAGGACGCGC TTTCAATAAA AGCGAGCTTA
2501 TCAACGTAGA GATTCCTATA GGCGTCACCT TCGAAAGAGA CTCAAAATCA

2551 GAAAAGGGAA CTTACGATCT TACTCTTATG TATATACTCG ATGCTTACCG 

2601 ACGCAATCCT AAATGTCAAA CTTCCCTAAT AGCTAGCGAT GCTAACTGGA 

2651 TGGCCTATGG TACCAACCTC GCACGACAAG GTTTTTCTGT TCGTGCTGCG 

2701 AACCATTTCC AAGTGAACCC CCACATGGAA ATCTTCGGTC AATOCGCTTT

2751 TGAAGTACGA AGTTCTTCAC GAAATTATAA TACAAACCTA GGCTCTAAGT

2801 TTTGTTTCTA G 

The PSORT algorithm predicts inner membrane (0.187). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 77A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
77B) and for FACS analysis.

The cp6728 protein was also identified in the 2D-PAGE experiment. 

These experiments show that cp6728 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 78 




The following C.pneumoniae protein (pid 


1 MFVMKKLVRL CVVLLSLLPN
51 STDILSRSLS SYIQSFDPHK
101 AIYRNINQLI HES IXtRARQW
151 LDEVKQRQRA LLLSYLSLHL
201 LGINDHGVAM DRDEEAYQFH
251 EKGMCGIGVV LKEDIDGVVV
301 HLSFRGVLDC LRGGHGSTVV
351 EPYGDGVIGK VTLHSFYEGE
401 NTGGFLSQAI KVSGLFMTNG
451 LVSKSSASAA EIVAQTLQDY
501 CFKVTVGKYY SPSGKSTQLQ
551 CDNVLHDPLT DLDTQTRPWF
601 SENSNFQAFL SQIKSSEKTD



A predicted signal peptide is highlighted. 
55 The cp6847 nucleotide sequence <SEQ ID 156> is: 
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1 ATGTTCGTAA TGAAAAAACT TGTCCGTCTA TGCGTAGTTC TTCTTTCTTT 
51 ACTTCCGAAT GTATTATTTT CTTCGGATCT TTTACGAGAA GAGGGCATCA 
101 AAAAGATGAT GGACAAGCTG ATCGAGTATC ATGTCGATGC TCAAGAGOTT 
5 ^ * CTACGGATA TACTCTCGCG TTCTTTATCT AGTTACATTC £££££ 

201 TCCTCATAAA TCTTATCTTT CAAACCAAGA GGTTGCAGTT TTTcScAGT 
251 CTCCGGAAAC AAAGAAACGT CTCTTAAAGA ATTATAAGGC AGGCAACTTT 
301 GCTATTTATC GCAACATCAA TCAATTAATT CATGAGAGTA TtcScgTCC 
351 CAGGCAGTGG AGAAACGAAT GGGTTAAGAA TCCAAAAGAG CTTGTATTGG 
10 AGGCATCCTC ATATCAGATA TCGAAGCAAC CTATGCAATG GAGCAAATCT 

W t* 1 TTAGACGAAG TGAAGCAGAG ACAACGCGCT CTACTCCTTT CCTATCTTOC 

501 TTTACATCTT GCTGGAGCTT CTTCCTCTCG TTATGAGGGT AAAoSqc 
551 AGCTTGCTGC TCTGTGTCTA CGTCAAATCG AGAACCATGA GAATGTATAT 
601 TTAGGTATCA ACGATCATGG TGTTGCTATG GATCGGGATG AAGAAGCCTA 
15 "J CCAATTCCAT ATCCGTGTTG TTAAAGCTTT AGCTCATAGC TTAGATGCAC 

° '"J ATACGGCGTA TTTCAGTAAG GACGAAGCGT TGGCGATGCG AATCCAACTA 

751 GAAAAAGGCA TGTGTGGAAT TGGTGTTGTT CTGAAGGAAG ATATTGATGG 
801 AGTTGTTGTT AGAGAAATCA TTCCTGGGGG ACCTGCGGCT AAATCTGGGG 
851 ATCTTCAGCT TGGAGATATC ATCTATCGGG TGGATGGCAA GGATATCGAG 
20 «?■ ™n TCTT TCCGCGGTOT TTTAGATOGT TTACGTGGAG GTCATGGCTC 

TACTGTAGTC TTAGATATCC ATCGTGGGGA GAGCGATCAT ACGATCGCCT 
1001 TGAGAAGGGA GAAAATCCTT TTAGAAGACC GTCGTGTGGA TGTTTCCTAT 
1051 GAGCCTTATG GAGATGGTGT GATTGGGAAA GTTACGTTAC ATTCTTTTTA 
S^SSS?* AATCAGGTTT CTAGTGAACA AGATCTACGT CGAGCGATTC 
0<? 1151 AGGGATTAAA GGAGAAGAAC CTTCTTGGAT TAGTTTTAGA TATCrTAPM 

AATACGGGTG GATTTTTATC TCAAGCGATC AAAGTTTCTG « TA ™ 
1251 GACCAATGGC GTTGTGGTTG TATCTCGCTA TGCTGATGGT ACCATGAAGT 
1301 GCTACCGCAC AGTATCTCCT AAAAAATTCT ATGATGGTCC 
llni " A f TATCTA AAAGTTCCGC ATCAGCAGCG GAGATTGTAG C^A^CTCT 
30 Al\ CCAAGATTAT GGAGTTGCTT TAGTTGTTGG AGATGAGCAG ACCTMGGGA 

1451 AGGGAACGAT TCAGCATCAA ACAATTACTG GAGATGCCTC TCAGGACGAT 
\£ ^™ AAGG TTACTGTAGG GAAATATTAT TCCCCTTCTG GGAAaSc 
1551 TCAACTTCAG GGAGTAAAAT CCGATATTTT AATTCCTTCT CTCTATGCTG 
\£ A ^ ATCGTCT AGGAGAGCG * TTTCTAGAGC ATCCCTTACC SIS 
35 TGTGA TAATG TACTTCACGA TCCTCTCACG GACTTGGATA CTCAAACACG 

35 ""J" TCCTTGGTTT CAAAAATACT ATCTTCCTAA TCTACAAAAG CAAGAGACTC 

1751 TTTGGAGAGA GATGCTACCT CAGCTTACGA AAAACAGTGA GCAA^gScOT 
1801 TCTGAGAATT CGAATTTTCA GGCATTTTTG TCGCAGATAA AATCATCTGA 
1851 AAAAACGGAC CTATCCTATG GTTCCAATGA TTTACAATTG GAAGAGTCGA 
1901 TAAACATTTT GAAGGACATG ATTTTATTAC AACAGTOtIg A^tS 

The PSORT algorithm predicts periplasmic (0.932).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 78A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
78B) and for FACS analysis.

These experiments show that cp6847 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.

Example 79 

The following C.pneumoniae protein (pid 4376969) was expressed <SEQ ID 157; cp6969>:

1 ISr f™ T LFFSLA ^SC CGYSILNSPY HLS SLGKSLD QERIFIAP1K 

1 2£f ™ CSA LTYELSK RSF AISGRSSCAG YTLKVELLNG IDKtJIGFTYA 

1 SS f™™*** SLSAKVQLIN NDTQEVLIDQ CVARESVDFD 

1 FEPDLGTANA HEFALGQPEM HSEAIKSARR ILSIRLAETI AQQVYYDLF* 

A predicted signal peptide is highlighted. 

The cp6969 nucleotide sequence <SEQ ID 158> is: 

55 si tttctttagg CACGATTTAT CTTTTTTTTT CTCTAGCACT 

^ ± ll CTtoIgo™ r TGG<raACT CTATTTTAAA CAGCCCGTAT CACTTATCGT 

101 CTTTAGGTAA GTCTTTATTA CAGGAAAGAA TTTTCATTGC TCCCATAAAA 
151 GAAGATCCTC ATGGTCAGCT CTGCTCAGCT CTAACTTATG AGCTTAGTAA

201 GCGTTCTTTT GCTATCTCTG GAAGGAGTTC TTGCGCAGGC TATACTCTTA 

251 AAGTAGAGCT TCTGAATGGT ATTGACAAGA ATATAGGTTT TACGTATGCC 

301 CCAAATAAAC TCGGAGATAA GACTCACAGG CATTTTATAG TCTCTAATGA

351 AGGCAGACTA TCACTATCTG CAAAAGTACA GCTTATCAAT AATGACACTC

401 AAGAAGTCCT TATAGACCAA TGTGTTGCTC GAGAGTCTGT AGACTTTGAC 

451 TTTGAGCCTG ACTTAGGAAC AGCAAACGCT CATGAATTTG CTTTAGGCCA 

501 ATTTGAAATG CATAGTGAAG CCATAAAAAG TGCTCGCCGT ATACTATCTA 

551 TACGCCTAGC CGAGACGATT GCTCAACAGG TATACTATGA CCTTTTTTGA 

The PSORT algorithm predicts inner membrane (0.126).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 79A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
79B) and for FACS analysis.

These experiments show that cp6969 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.



Example 80 

The following C.pneumoniae protein (pid 4377109) was expressed <SEQ ID 159; cp7109>:

1 MKKTCCQN YR SIGWFSWL FVXjTTQTLFA GHFIDIGTSG IiYSWARGVSG

51 DGRVWGYEG GNAFKYVDGE KKDLEGLVPR SEALVFKASY DGSVIIGISD

101 QDPSCRAVKW VNGALVDLGI. . FSBGMQSFAE GVSSDGKTIV GCLYSDDTET

151 NFAVKWDETG MWLPNLPED RHSCAWDASE DGSVIVGDAM GSEEIAKAVY

201 WKDGEQHLLS NIPGAKRSSA havskdgsfi vgefiseene vhafvyhngv

251 IKDIGTLGGD YSVATGVSRD GKV1VGHSTR TDGEYRAFKY VDGRMIDLGT

301 LGGSASFAFG VSDDGKTIVG KFETELGECH AFIYLDD*

A predicted signal peptide is highlighted.



The cp7109 nucleotide sequence <SEQ ID 160> is: 

1 ATGAAAAAGA CATGTTGCCA AAATTACAGA TCGATAGGCG TTGTGTTCTC 

51 TGTGGTACTT TTCGTTCTTA CAACACAGAC GCTGTTTGCA GGACATTTTA 

101 TTGATATTGG AACTTCTGGA TTATATTCTT GGGCTCGAGG TGTATCTGGA 

151 GATGGCCGCG TTGTCGTAGG TTATGAAGGT GGCAATGCAT TTAAATATGT

201 TGATGGTGAG AAATTTCTGT TAGAAGGTTT GGTCCCGAGA TCCGAGGCCT 

251 TGGTATTTAA AGCTTCTTAT GATGGCTCTG TAATTATAGG AATCTCGGAT 

301 CAAGATCCGT CTTGCCGCGC TGTGAAGTGG GTAAACGGTG CACTTGTTGA 

351 TCTTGGAATA TTTTCTGAGG GAATGCAATC TTTTGCAGAG GGTGTTTCCA

401 GTGATGGAAA GACGATTGTA GGGTGCCTAT ATAGTGATGA TACAGAGACA

451 AACTTTGCTG TGAAGTGGGA TGAAACAGGA ATGGTTGTTC TCCCTAACTT 

501 ACCAGAAGAT CGACATTCTT GCGCTTGGGA TGCCTCTGAA GATGGCTCTG 

551 TGATTGTAGG GGACGCCATG GGTAGGGAGG AAATTGCCAA GGCAGTGTAC 

601 TGGAAGGACG GTGAACAACA TCTGCTTTCT AATATCCCAG GAGCTAAAAG 

651 ATCGTCAGCA CATGCAGTTT CTAAAGATGG ATCTTTTATC GTAGGCGAGT

701 TCATCAGTGA AGAAAATGAA GTTCATGCCT TTGTTTATCA CAACGGTGTT 

751 ATCAAAGATA TCGGGACTTT AGGAGGAGAT TACTCTGTAG CAACTGGAGT

801 TTCTAGGGAT GGTAAGGTCA TCGTGGGTCA TTCTACAAGA ACAGATGGTG 

851 AATACCGTGC ATTTAAATAT GTGGATGGAA GAATGATAGA TTTGGGGACT

901 TTAGGAGGTT CAGCATCTTT TGCTTTTGGT GTTTCTGACG ATGGCAAAAC

951 AATCGTAGGA AAATTTGAAA CAGAGCTAGG AGAATGTCAT GCCTTTATCT 

1001 ACCTTGATGA TTAG 

The PSORT algorithm predicts outer membrane (0.887). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 80A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
80B) and for FACS analysis.

These experiments show that cp7109 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.

Example 81 

The following C.pneumoniae protein (pid 4377110) was expressed <SEQ ID 161; cp7110>:

1 BgAIKQILRS HLSQSSLWMV LFSLYSLSGY CYVITDKPED DFHSSSAVKW
51 DHWGKTTLSR LSNKKASAKA VSGTGATTVG FIKDTWSRTY AVRWNYWGTK
101 ELPTSSWVKK SKATGISSDG SIIAGIVENE LSQSFAVTWK NNEMYLLPST
151 WAVQSKAYGI SSDGSVIVGS AKDAWSRTFA VKWTGHEAQV LPVGWAVKSV
201 ANSVSANGSI ^VGSVQDASG ILYAVKWEGN TITHLGTLGG YSAIAKAVSN
251 ^KVIVGRSE ^VGEVHAFC HKNGVMSDLG TIGGSYSAAK GVSA^X^
301 GMSTTANGKL HAFKYVGGRM IDLGEYSWKE ACANAVSIDG EIIVGVQSE

A predicted signal peptide is highlighted.

The cp7110 nucleotide sequence <SEQ ID 162> is:

1 ATGGCAGCTA TAAAACAAAT TTTACGTTCT ATGCTATCTC AGAGTAGCTT
51 ATGGATGGTC CTATTTTCAT TATATTCTCT ATCTGGTTAT TGCTATGTAA
101 TTACAGACAA ACCAGAAGAT GACTTCCATT CTTCATCCGC AGTAAAATGG
151 GATCATTGGG GAAAGACAAC TCTCTCAAGA TTATCAAATA AAAAAGCCTC
201 I^™ GCT GTTTCAGGAA CTGGTGCTAC AACTGTCGGC TTTATAAAAG
251 ACACTTGGTC TCGAACATAC GCAGTAAGAT GGAATTATTG GGGGACCAAA
301 GAACTCCCTA CCAGCTCATG GGTAAAAAAA TCAAAAGCAA CAGGAATCTC
351 CTCTGATGGG TCTATAATCG CGGGGATTGT CGAGAATGAG CTTTCTCAAA
401 GTTTCGCAGT CACATGGAAA AACAATGAAA TGTATTTGCT CCCTTCCACA
451 TGGGCAGTGC AATCTAAAGC GTATGGAATT TCTTCTGATG GCTCTGTTAT
501 TGTAGGGAGT GCTAAGGATG CTTGGTCGCG AACTTTCGCT GTGAAGTGGA
551 CGGGACACGA GGCTCAGGTG TTACCAGTAG GCTGGGCTGT CAAATCTGTA
601 GCGAATTCTG TATCTGCCAA TGGATCTATA ATTGTAGGGT CTGTACAAGA
651 CGCCTCTGGA ATTCTTTATG CTGTAAAGTG GGAAGGGAAC ACTATTACAC
701 ATCTAGGAAC TTTAGGAGGC TATTCTGCCA TTGCAAAAGC TGTATCCAAT
751 AATGGCAAGG TCATTGTAGG GAGATCCGAA ACATATTATG GAGAGGTCCA
801 TGCTTTCTGT CATAAGAATG GCGTCATGTC AGACCTCGGC ACCCTCGGAG
851 GATCTTATTC TGCAGCTAAG GGAGTCTCTG CAACTGGAAA AGTTATTGTC
901 GGTATGTCCA CAACAGCAAA TGGGAAATTG CATGCCTTTA AATATGTCGG
951 TGGAAGAATG ATCGACTTAG GAGAGTATAG CTGGAAAGAA GCCTGTGCAA
1001 ACGCTGTTTC TATTGATGGA GAAATTATTG TTGGAGTCCA ATCAGAATAA

The PSORT algorithm predicts outer membrane (0.827).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 81A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
81B) and for FACS analysis.

These experiments show that cp7110 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.

Figure 191 shows a schematic representation of the structural relationships between cp7105,
cp7106, cp7107, cp7108, cp7109 and cp7110, each of which is identified herein. These six proteins
may be grouped in a new family of related outer membrane-associated proteins. These proteins have
a repeat structure in common (cf. the pmp family).
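As an illustration only of how a shared repeat structure might be looked for in such sequences, the following minimal sketch lists every exact k-residue window that occurs more than once in a protein. The function name `repeated_kmers` and the window length are assumptions made for this example; this is not the analysis used to prepare Figure 191, and exact matching will miss repeats that have diverged.

```python
from collections import defaultdict

def repeated_kmers(protein, k=5):
    """Map each k-residue window seen more than once to its 1-based start positions."""
    positions = defaultdict(list)
    for i in range(len(protein) - k + 1):
        positions[protein[i:i + k]].append(i + 1)
    # Keep only windows that occur at least twice.
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}
```

Diverged repeats, such as those expected within a protein family, would generally need an alignment-based method rather than exact window matching.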

Example 82

The following C.pneumoniae protein (pid 4377127) was expressed <SEQ ID 163; cp7127>:

1 MVFFRMSIttH LVALSGMLCC SSGVALTIAE KMASLEHSGR GADDYEGMAS
51 PNANMREYSL QLSKLYEEAR KLRASGTEDE ALWKDLIRRI GEVRGYLREI
101 EELWAAEIRE KGGNLEDYAIt WNHPETTIYN LVTDYGTEDS IYLXPQEIGA
151 IKIATLSKFV VPKESFEDCL TQILSRLGXG VRQVNSWIKE LYMMRKEGCS
201 VAGVFSSRKD LiEALPETAYI GFVLNSNVDA HTNQHVLKKF INPETTHVDV
251 IAGRVWIFGS AGEVGELLKI YNFVQSESIR QEYRVIPLTK IDPGEMIS1L
301 NAAFREDLTK DVSEESLGLR WPLQYQGRS LFLSGTAALV QQALTLIREL
351 EEGIENPTDK TVFWYNVKHS DPQELAALLS QVHDVFSGEN KASVGAADGC
401 GSQLNASIQI DTTVSSSAKD GSVKYGNFIA DSKTGTLIMV VEKEVLPRIQ
451 MLLKKLDVPK KMVRIEVLLF ERKIiAHEQKS GIjNIiLRLGEE VCKKGCSPSV
501 SWAGGTGILE FLFKGSTGSS IVPGYDLAYQ FLMAQEDVRI NASPSWTMN
551 QTPARIAWD EMSIAVSSDK DKAQYNRAQY GIMIKMLPVI NVGEEDGKSY
601 ITLETDITFD TTGKNHDDRP DVTRRNITNK VRIADGETVI IGGLRCKQMS
651 DSHDGIPFLG DlPGIGKLrFG MSSTSDSLTE MFVFITPKIL ENPVEQQERK
701 EEALLSSRPG EREEYYQALA ASEAAARAAH KKLEMFPASG VSLSQVERQE
751 YDGC*



A predicted signal peptide is highlighted. 

The cp7127 nucleotide sequence <SEQ ID 164> is: 



1 ATGGTTTTTT TCCGTAATTC TTTACTGCAT TTAGTTGCCC TATCCGGAAT
51 GCTCTGTTGT TCTTCTGGAG TGGCTTTAAC GATAGCCGAG AAGATGGCTT
101 CTTTAGAGCA CTCGGGGAGA GGAGCAGACG ATTATGAGGG GATGGCTTCG
151 TTTAATGCCA ATA1GAGGGA GTATAGCCTT CAGCTGAGCA AGTTGTATGA
201 GGAAGCACGA AAGCTACGCG CTTCTGGAAC TGAGGATGAA GCTCTGTGGA
251 AGGACTTAAT TCGACGGATT GGTGAGGTGC GAGGCTATCT TCGAGAGATC
301 GAGGAGCTTT GGGCTGCAGA AATTCGTGAG AAAGGGGGCA ATCTCGAGGA
351 CTACGCCCTC TGGAATCACC CAGAGACTAC GATTTACAAT CTTGTTACCG
401 ATTACGGAAC CGAAGACTCT ATTTATTTGA TTCCTCAAGA AATCGGAGCG
451 ATTAAAATCG CAACCTTATC GAAATTTGTA GTTCCTAAAG AGTCTTTCGA
501 AGACTGTCTC ACTCAGATCC TATCTCGCTT AGGTATTGGC GTGCGTCAGG
551 TCAATTCTTG GATTAAGGAA CTTTATATGA TGCGTAAGGA GGGCTGCAGT
601 GTTGCTGGAG TTTTTOCCTC CAGAAAAGAT TTAGAGGCGC TCCCAGAAAC
651 AGCCTATATT GGTTTTGTAT TGAATTCGAA CGTAGATGCG CATACCAATC
701 AACATGTCTT AAAAAAGTTC ATTAACCCTG AAACAACGCA TGTAGATGTG
751 ATTGCAGGAC GTGTGTGGAT TTTTGGTTCT GCGGGGGAAG TCGGCGAGCT
801 TCTGAAGATT TATAATTTTG TGCAGTCGGA GAGCATACGT CAAGAGTATC
851 GGGTGATTCC CTTAACTAAG ATCGATCCAG GGGAGATGAT TTCCATTCTC
901 AACGCAGCAT TTCGTGAGGA TCTGACTAAA GATGTTAGTG AAGAATCOTT
951 AGGCCTTCGT GTAGTTCCTT TACAGTATCA AGGGCGTTCG TTGTTTTTAA
1001 GTGGAACCGC GGCGTTAGTC CAGCAAGCGC TGACTCTCAT TCGAGAGCTT
1051 GAAGAAGGGA TTGAGAACCC TACGGATAAA ACAGTATTTT GGTATAACGT
1101 CAAGCACTCC GATCCCCAAG AGTTGGCGGC ATTGCTTTCC CAAGTCCATG
1151 ATGTCTTCTC TGGCGAGAAT AAGGCGAGTG TCGGAGCTGC AGATGGATGT
1201 GGGTCGCAAT TAAATGCCTC GATCCAAATT GATACTACAG TAAGTTCTTC
1251 TGCGAAAGAT GGCTCAGTGA AGTACGGAAA CTTCATCGCG GATTCTAAGA
1301 CAGGAACTCT GATTATGGTG GTTGAGAAAG AAGTTCTTCC ACGTATTCAG
1351 ATGCTACFTA AGAAACTAGA TGTCCCTAAA AAGATGGTCC GTATCGAGGT
1401 GCTGTTATTT GAAAGAAAAT TGGCACATGA GCAGAAATCT GGGTTAAATC
1451 TTCTACGTCT TGGTGAGGAA GTTTGTAAAA AAGGGTGCAG TCCTTCTGTG
1501 TCTTGGGCCG GGGGTACTGG CATACTAGAA TTTTTATTTA AAGGAAGTAC
1551 GGGATCTTCG ATAGTTCCTG GTTATGATCT CGCCTATCAA TTTTTAATGG
1601 CTCAAGAGGA CGTTCGGATT AATGCGAGTC CTTCTGTAGT TACTATGAAC
1651 CAAACCCCAG CACGGATTGC TGTTGTTGAT GAAATGTCAA TAGCGGTGTC
1701 TTCAGATAAA GATAAAGCGC AATACAATCG TGCGCAGTAC GGTATCATGA
1751 TAAAAATGCT CCCCGTAATT AATGTGGGAG AGGAAGACGG AAAAAGTTAC
1801 ATTACTTTAG AGACAGACAT CACCTTTGAT ACTACGGGAA AAAATCATGA
1851 TGATCGTCCT GATGTTACAA GGCGTAATAT TACTAATAAG GTGCGCATTG
1901 CTGACGGAGA GACTGTGATT ATTGGAGGTT TGCGTTGCAA ACAGATGTCA
1951 GATTCTCATG ATGGCATTCC TTTCCTTGGA GACATTCCTG GTATAGGGAA
2001 GMATTTGGA ATGAGTTCCA CATCAGACAG TCTCACGGAG ATGTTTGTAT
2051 TTATCACTCC GAAGATCCTA GAAAATCCTG TAGAGCAACA AGAACGTAAA
2101 GAAGAAGCTT TACTCTCTTC GCGCCCTGGA GAGAGAGAAG AATACTATCA
2151 GGCTTTAGCA GCTAGTGAGG CTGCAGCACG AGCAGCTCAT AAAAAATTAG
2201 AGATGTTCCC GGCATCAGGA GTATCTTTAT CTCAGGTAGA GAGGCAAGAA
2251 TACGATGGCT GCTAG

The PSORT algorithm predicts periplasmic (0.920).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 82A) and also in
his-tagged form. The recombinant proteins were used to immunise mice, whose sera were used in a
Western blot (Figure 82B) and for FACS analysis.

These experiments show that cp7127 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.

Example 83 

The following C.pneumoniae protein (pid 4377133) was expressed <SEQ ID 165; cp7133>:

1 MQPFIFTLLC IiT SLVSLVAF DAANARKRCA CAQTIERGEN FFSIKRSACA
51 EIEYQEKSRH ASAIERISKD KGKVTPKQIA KVATKKKQRY RLLQVPPSRP
101 PNNSRYWLYA LLSEPPECYS DTASWYAIFI RLLRRAYVDT GNVPPGSEYA
151 IANALISNKQ EILERGAQLG PDVIETLTLP EEQAEIFYKM LKGSSNSOSL
201 IiNFLHYEEKS LGHCKLNLIF MDPLLLEAVL DHPDAYRETS LLRDGIWEAV
251 KRQEHAIQEH GQAAALELFK TRTDFRtiELR DKMQLLLSRY DLLPLLNKKM
301 FDYTLGSAGD YLFLVBPDTK AISRCRCPSK SIKL

A predicted signal peptide is highlighted. 

The cp7133 nucleotide sequence <SEQ ID 166> is: 

1 ATGCAACCTT TTATCTOTAC TTTACTGTGC TTGACATCTT TGGTTTCTTT 
51 AGTCGCCTTT GATGCTGCGA ATGCTCGTAA ACGTTGTGCC TGTGCTCAAA 
101 CTATAGAACG TGGAGAGAAC TTCTTTTCCA TAAAACGCTC TGCTOGTGCT 
151 GAAATCGAAT ATCAAGAAAA ATCTCGCCAC GCCTCAGCAA TTGAAAGAAT 
201 CTCAAAAGAT AAAGGCAAAG TCACTCCAAA GCAGATTGCG AAAGTAGCTA 
251 CTAAGAAAAA GCAAAGATAC CGTTTATTGC AGGTTCCTTT TTCAAGGCCT 
301 CCGAATAACT CAAGGTATAA CCTCTATGCT TTGCTTAGTG AACCTCCCGA 
351 ATGCTATAGC GATACAGCAT CATGGTATGC TATTTTTATT CGGTTACTTC 
401 GACGTGCTTA TGTAGACACG GGAAATGTAC CTCCTGGATC TGAGTATGCC 
451 ATCGCTAATG CTTTGATAAG TAACAAACAA GAGATTTTAG AGAGGGGAGC 
501 GCAGCTTGGA CCCGATGTTA TTGAAACTCT AACATTGCCT GAGGAACAAG 
551 CCGAGATTTT TTATAAAATG CTCAAAGGGT CGTCAAACTC TCAGTCGCTA 
601 CTGAATTTTC TGCATTATGA AGAGAAAAGC TTAGGCCACT GTAAGCTAAA 
651 TCTGATCTTC ATGGATCCCC TACTGTTAGA AGCTGTTCTA GATCATCCCG 
701 ATGCTTATAG GGAAACGTCG CTCCTGCGCG ATGGCATTTG GGAAGCGGTG 
751 AAGCGTCAAG AACATGCCAT CCAAGAACAT GGCCAGGCAG CTGCTTTGGA 
801 GCTTTTTAAA ACACGCACCG ACTTCCGCCT GGAGCTGCGA GATAAGATCC 
851 AGTTACTOCT AAGTCGATAC GATTTGCTCC CCTTATTAAA TAAAAAAATG
901 TTCGACTACA CCTTAGGAAG TGCCGGAGAT TACTTATTTT TGGTAGACCC
951 AGATACTAAG GCAATTTCTC GATGTCGCTC CCCTTCAAAG AGTATTAAAT
1001 TATAA

The PSORT algorithm predicts outer membrane (0.92). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 83A) and also in
his-tagged form. The recombinant proteins were used to immunise mice, whose sera were used in a 
Western blot (Figure 83B) and for FACS analysis. 

These experiments show that cp7133 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
Example 84 



The following C.pneumoniae protein (pid 4377222) was expressed <SEQ ID 167; cp7222>:

1 MMRRDMVITA WVMAILLVA LF VT SKRIGV KDYDEftFRNF ASSKVTQAW
51 SEEKVIEKPV VAEVPSRPIA KETLAAQFIE SKPVIVTTPP VPWSETPEV
101 PTVAVPPQPV RETVKEEQAP YATWVKKGD FLERIARANH TTVAKLMQIN
151 DLTTTQLKIG QVIKVPTSQD VSNEKTPQTQ TANPENYYIV QEGDSPWTIA 
201 LRNHIRLDDL LKMNDLDEYK ARRLKPGDQL RIR* 

A predicted signal peptide is highlighted. 

The cp7222 nucleotide sequence <SEQ ID 168> is: 

1 ATGAATCGTA GAGACATGGT AATAACAGCT GTCGTAGTGA ATGCTATATT 
51 GCTTGTGGCT CTTTTCGTCA CATCAAAGCG TATTGGCGTC AAGGACTATG 
101 ACGAGGGATT CCGTAATTTT GCTTCTAGCA AGGTTACACA AGCAGTAGTT 
151 TCAGAAGAAA AAGTCATAGA AAAGCCTGTA GTCGCAGAAG TGCCTAGCCG
201 TCCTATCGCT AAAGAGACTC TAGCTGCACA GTTTATTGAA AGTAAGCCGG
251 TTATTGTAAC CACACCACCC GTGCCTGTTG TTAGCGAAAC CCCAGAAGTG 
301 CCTACTGTGG CAGTTCCGCC TCAGCCTGTT CGTGAGACAG TAAAAGAGGA 
351 ACAAGCTCCT TATGCTACTG TTGTAGTGAA AAAAGGAGAT TTTCTCGAAC 
401 GCATTGCGAG AGCAAATCAT ACTACCGTTG CAAAATTGAT GCAGATCAAT 
451 GATCTTACCA CCACCCAACT TAAAATTGGT CAGGTCATCA AAGTCCCTAC 
501 GTCTCAAGAT GTCAGCAACG AAAAAACTCC TCAAACACAG ACCGCAAACC 
551 CTGAAAATTA TTATATCGTC CAAGAAGGGG ATAGCCCGTG GACAATAGCA 
601 TTGCGTAACC ATATTCGATT GGATGATTTG CTAAAAATGA ATGATCTCGA 
651 TGAATATAAA GCCCGGCGCC TTAAGCCTGG AGATCAGTTG CGCATACGTT 
701 GA 
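
For readers who wish to cross-check a nucleotide listing against the corresponding protein listing, the following minimal sketch joins a position-numbered listing into one string and translates it with the standard genetic code. The helper names `parse_listing` and `translate` are introduced here for illustration only and form no part of the original disclosure; the sketch assumes an uninterrupted reading frame starting at the first base.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def parse_listing(text):
    """Join a position-numbered listing ('1 ATGAATCGTA ...') into one string."""
    blocks = []
    for line in text.splitlines():
        tokens = line.split()
        # Drop the leading position number, keep the sequence blocks.
        if tokens and tokens[0].isdigit():
            tokens = tokens[1:]
        blocks.extend(tokens)
    return "".join(blocks).upper()

def translate(dna):
    """Translate codon by codon; 'X' marks codons containing non-ACGT characters."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)
```

Applied to the first row of the cp7222 listing, `translate(parse_listing(...))` begins MNRRDMV; an 'X' in the output points to a codon containing a character other than A, C, G or T, as can arise from character-recognition errors.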

The PSORT algorithm predicts periplasmic (0.935). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 84A) and also in
his-tagged form. The recombinant proteins were used to immunise mice, whose sera were used in a
Western blot (Figure 84B) and for FACS analysis. 

These experiments show that cp7222 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 85 

The following C.pneumoniae protein (pid 4377225) was expressed <SEQ ID 169; cp7225>:

1 MKGTPQYHFI GIGGIGMSAL AHILLDRGYE VSGSDLYESY TIESLKAKGA 

51 RCFSGHDSSH VPHDAWVYS SSIAPDNVEY LTAIQRSSRL UiRAELLSQI, 

101 MEGYESILVS GSHGKTGTSS LIRAIFQEAQ KDPSYAIGGL AANCLNGYSG 

151 SSKIFVAEAD ESDGSLKHYT PRAWITNID NEHLNJJYAGN LDNLVQVIQD 

201 FSRKVTDLNK VFYNGDCPIL KGNVQGISYG YSPECQLHIV SYNQKAWQSH 

251 FSFTFIiGQEY QDIELNLPGQ HNAANAAAAC GVALTFG ID I NIJRKALKKF 

301 SGVHRRLERK NISESFLFLE DYAHHPVEVA HTLRSVRDAV GLRRVIAIFQ 

351 PHRFSRLEEC LQTFPKAFQE ADEVTI/TDVY SAGESPRES1 ILSDLAEQIR 

401 KSSYVHCCYV PHGDIVDYLR NYIRIHDVCV SLGAGNIYTI GEALKDFNPK 

451 KLSIGLVCGG KSCEHDISLL SAQHVSKYIS PEFYDVSYF1 INRQGLWRTG 

501 KDFPHLIEET QGDSPLSSEI ASALAKVDCIj FPVLHGPFGE DGTIQGFFEI 

551 LGKPYAGPSL SLAATAMDKL LTKRIASAVG VPWPYQPLN LCFWKRNPEL 

601 CIQNLIETFS FPMIVKTAHL GSSIGIFLVR DKEELQEKIS EAFLYDTDVF 

651 VEESRliGSRE IEVSCIGHSS SWYCMAGPME RCGASGFIDY QEKYGFDGID 

701 CAKISFDLQL SQESLDCVRE LAERVYRAMQ GKGSARI DFF LDEEGNYWLS 

751 EVNPIPGMTA ASPFLQAFVH AGWTQEQIVD HFIIDALHKF DKQOT1EQAF 

801 TKEQDLVKR* 

The cp7225 nucleotide sequence <SEQ ID 170> is: 

1 ATGAAGGGAA CTCCTCAGTA TCATTTTATC GGTATCGGTG GTATAGGAAT
51 GAGCGCTTTA GCTCATATTT TGCTTGATCG TGGCTATGAG GTCTCTGGAA
101 GCGACTTATA TGAAAGCTAT ACGATCGAAA GCCTGAAAGC TAAAGGTGCG
151 AGGTGTTTCT CAGGCCATGA TTCCTCCCAT GTTCCTCATG ATGCCGTCGT
201 TGTTTATAGC TCAAGTATAG CCCCTGATAA TGTAGAGTAT CTTACCGCTA
251 TTCAAAGATC ATCACGTCTT CTTCATAGAG CAGAGCTCTT GAGTCAGCTT
301 ATGGAGGGTT ATGAAAGCAT TCTGGTTTCA GGAAGCCATG GGAAGACAGG
351 GACCTCATCT CTAATTCGAG CGATTTTCCA GGAAGCTCAG AAAGATCCCT
401 CCTATGCTAT TGGAGGACTC GCTGCAAACT GCCTGAATGG GTATTCTGGA
451 TCATCGAAAA TCTTCGTTGC CGAAGCCGAT GAAAGTGATG GGTCTTTAAA
501 GCACTACACT CCCCGTGCAG TAGTCATTAC AAATATAGAT AATGAACATT
551 TGAATAATTA CGCTGGGAAT CTTGATAACC TGGTTCAGGT AATCCAGGAC
601 TTCTCTAGAA AAGTAACAGA TCTCAATAAG GTATTOTATA ACGGGGATTG
651 TCCTATTTTG AAAGGAAATG TCCAAGGGAT TTCTTATGGA TATTCACCAG
701 AATGTCAATT GGATATCGTT TCCTATAATC AAAAGGCATG GCAATCTCAC
751 TTTTCCTTTA CCTTTTTAGG CCAGGAGTAT CAAGACATTG AGCTCAATCT
801 CCCTGGACAA CATAACGCTG CAAATGCAGC AGCAGCCTGT GGAGTTGCTC
851 TTACCTTTGG CATAGACATA AACATCATTC GAAAAGCTCT CAAAAAATTC
901 TCGGGAGTTC ATCGACGTCT AGAAAGAAAA AATATATCCG AAAGCTTTCT
951 TTTCTTAGAA GATTATGCTC ATCATCCTGT AGAGGTTGCA CATACCCTGC
1001 GCTCTGTGCG TGATGCTGTG GGTTTGCGAA GAGTCATCGC AATTTTTCAA
1051 CCACATCGAT TCTCTCGOTT AGAAGAGTGC TTACAAACCT TCCCCAAAGC
1101 TTTCCAAGAA GCTGATGAAG TCATACTTAC AGATGTCTAT AGTGCCGGAG
1151 AAAGTCCTAG AGAGTCTATC ATTCTTTCCG ACCTTGCGGA ACAGATTCGT
1201 AAGTCTTCTT ATGTCCATTG TTGTTATGTT CCCCATGGAG ACATCGTAGA
1251 TTATCTACGA AACTACATTC GCATTCATGA TGTCTGTGTT TCTCTAGGAG
1301 CTGGAAATAT CTATACTATT GGAGAGGCTT TAAAAGACTT TAACCCTAAA
1351 AAATTATCCA TAGGACTCGT CTGTGGAGGG AAATCTTGCG AACACGATAT
1401 TTCTCTACTT TCTGCTCAAC ATGTCTCTAA ATATATTTCT CCTGAATTCT
1451 ATGATGTGAG TTACTTCATC ATAAATCGTC AGGGCTTATG GAGAACAGGA
1501 AAGGATTTTC CTCATCTTAT TGAAGAGACT CAAGGGGATT CGCCACTTTC
1551 TTCTGAAATC GCTTCAGCTT TAGCAAAAGT CGACTGTTTG TTTCCCGTGC
1601 TCCATGGCCC ATTTGGAGAG GATGGTACGA TCCAGGGATT TTTTGAAATC
1651 TTAGGAAAAC CTTATGCCGG ACCCTCACTA TCTTTAGCAG CAACTGCAAT
1701 GGATAAGCTG TTAACAAAAC GAATTGCATC AGCAGTGGGT GTTCCTGTAG
1751 TCCCTTACCA ACCTTTAAAT CTCTGTTTCT GGAAACGCAA TCCAGAACTA
1801 TGTATTCAGA ATCTTATAGA GACATTTTCT TTCCCTATGA TTGTAAAAAC
1851 TGCACATTOG GGATCTAGTA TTGGGATATT TTTAGTCCGT GATAAAGAGG
1901 AATTACAAGA AAAGATCTCA GAAGCATTTC TATATGACAC GGATGTGTTT
1951 GTGGAGGAAA GTCGCTTAGG GTCTCGTGAA ATCGAAGTGT CCTGTATCGG
2001 CCATTCTTCT AGCTGGTATT GTATGGCAGG GCCTAATGAA CGCTGTGGTG
2051 CTAGTGGGTT TATTGATTAT CAAGAGAAAT ATGGATTTGA TGGCATAGAT
2101 TGCGCAAAGA TCTCTTTTGA TTTACAGCTC TCACAAGAAT CTTTAGATTG
2151 TGTTAGAGAA CTTGCAGAGC GTGTCTACCG AGCAATGCAA GGAAAAGGTT
2201 CAGCTCGAAT AGATTTTTTC TTGGATGAAG AGGGGAATTA TTGGTTGTCA
2251 GAGGTCAATC CTATTCCAGG AATGACAGCA GCTAGCCCAT TTTTACAAGC
2301 TTTTGTTCAC GCAGGATGGA CGCAAGAACA AATTGTAGAT CACTTTATTA
2351 TAGATGCTCT ACATAAGTTT GATAAGCAGC AGACTATCGA ACAGGCATTC
2401 ACTAAAGAAC AAGATTTAGT TAAAAGATAA



The PSORT algorithm predicts inner membrane (0.16). 

The protein was expressed in E.coli and purified as a his-tag product (Figure 85A). The recombinant
protein was used to immunise mice, whose sera were used in a Western blot (Figure 85B) and for 
FACS analysis. 

These experiments show that cp7225 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
Example 86 

The following C.pneumoniae protein (pid 4377248) was expressed <SEQ ID 171; cp7248>:

III SSS SET' -o**™" °J£S£i 

A predicted signal peptide is highlighted. 

The cp7248 nucleotide sequence <SEQ ID 172> is: 

1 ATGAAATTTT GGTTGCAAGG ATGTGCTTTT GTCGGTTGTC TGCTATTGAC 
51 TTTACCTTGT TGTGCTGCAC GAAGACGTGC TTCTGGAGAA AATTTGCAAC
101 AAACTCGTCC TATAGCAGCT GCAAATCTAC AATGGGAGAG CTATGCAGAA 
151 GCTCTTGAAC ATTCTAAACA AGATCACAAA CCTATTTGTC TTTTCTTTAC 
201 AGGATCAGAC TGGTGTATGT GGTGCATAAA AATGCAAGAC CAGATTTTGC 
251 AAAGCTCTGA GTOTAAGCAT TTTGCGGGTG TGCATCTGCA TATGGTTGAA 
301 GTTGATTTCC CCCAAAAGAA TCATCAACCT GAAGAGCAGC GCCAAAAAAA 
351 TCAAGAACTG AAAGCTCAAT ATAAAGTTAC AGGATTCCCC GAACTGGTCT 
401 TCATAGATGC AGAAGGAAAA CAGCTTGCTC GCATGGGATT TGAGCCTGGT 
451 GGTGGAGCTG CTTACGTAAG CAAGGTGAAG TCTGCTCTTA AACTACGTTA 
501 A 
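
Because listings of this kind can acquire transcription errors, a quick consistency check of a coding sequence may be useful. The sketch below is illustrative only: the function name `check_orf` is an assumption introduced here, and the ATG start test is a simplification, since bacterial genes can also begin with alternative start codons such as GTG or TTG.

```python
def check_orf(dna):
    """Return a list of human-readable problems found in a coding sequence."""
    problems = []
    # Remove spaces and line breaks left over from the block layout.
    seq = "".join(dna.split()).upper()
    bad = sorted(set(seq) - set("ACGT"))
    if bad:
        problems.append("non-ACGT characters: " + "".join(bad))
    if len(seq) % 3 != 0:
        problems.append("length %d is not a multiple of 3" % len(seq))
    if not seq.startswith("ATG"):
        problems.append("does not start with ATG")
    if seq[-3:] not in ("TAA", "TAG", "TGA"):
        problems.append("does not end with a stop codon")
    return problems
```

An empty list means that none of these simple checks failed; it does not establish that the sequence is correct.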

The PSORT algorithm predicts periplasmic (0.932). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 86A) and also in 
his-tagged form. The recombinant proteins were used to immunise mice, whose sera were used in a 
Western blot (Figure 86B) and for FACS analysis.

The cp7248 protein was also identified in the 2D-PAGE experiment. 

These experiments show that cp7248 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 87 

The following C.pneumoniae protein (pid 4377249) was expressed <SEQ ID 173; cp7249>:

1 MIPSPTPINF RDDTILETDP KPSLIMFSSK KTE IASERRK AHPTLFKVLG 

51 TIWNIVKFII SIILFLPLAL LWVLKKTCQF FILPSSXISQ SMSKTAVAIR 

101 RMTFLSHIKQ LLSLKEISAA DRWIQYDDL WDSLAIKIP HALPHRWILY 

151 SQGNSGLMEN LFDRGDSSLH QLAKATGSNL LVFNYPGIMS SKGEAKRENL 

201 VKSYQACVRY LRDEETGPKA NQIIAFGYSL GTSVQAAALD REVTDGSDGT 

251 SWIWKDRGP RSLADVANQI CKPIASAIIK LVGWNIDSVK PSERLRCPEI 

301 FIYNSNHDQE LISDGLFERE NCVAT PFLEL PEVKTSGTKI PIPERDLLHI, 

351 NPLSPNWDR LAAVI SNYIjD SENRKSQQPD * 

The cp7249 nucleotide sequence <SEQ ID 174> is: 

1 ATGATCCCAT CCCCTACCCC AATAAACTTT CGTGATGATA CGATTCTAGA 

51 GACGGATCCA AAGCCGTCTT TAATCATGTT CTCTTCAAAA AAAACAGAGA 

101 TAGCTTCTGA AAGACGGAAG GCCCATCCCA CCTTATTTAA AGTTCTAGGA 

151 ACGATTTGGA ATATTGTGAA GTTTATTATC TCAATCATTC TGTTCCTTCC 

201 CTTAGCGTTA TTGTGGGTAC TCAAGAAAAC CTGTCAGTTT TTCATTCTCC 

251 CATCTTCTAT CATATCTCAG AGCATGTCAA AAACAGCTGT GGCAATTCGG 

301 CGAATGACCT TTCTGTCCCA TATTAAACAA CTCCTAAGCC TTAAGGAAAT

351 CTCAGCTGCC GATCGTGTGG TTATACAATA TGACGATTTG GTGGTTGATA 

401 GCTTAGCTAT AAAGATACCT CATGCTCTTC CCCACAGGTG GATTCTTTAT 

451 TCTCAAGGAA ACTCTGGATT GATGGAAAAC CTGTTCGATC GGGGCGATTC 

501 CTCTCTACAC CAGCTAGCCA AAGCAACCGG CTCGAATCTT CTTGTGTTCA 

551 ACTATCCTGG AATTATGTCC AGCAAAGGAG AAGCGAAACG AGAAAATCTG 

601 GTTAAATCGT ATCAGGCATG CGTACGCTAC CTACGAGATG AAGAGACAGG 

651 TCCTAAAGCC AATCAAATCA TAGCTTTCGG ATACTCTTTG GGAACTAGTG 

701 TCCAAGCTGC TGCTCTAGAT CGTGAGGTCA CTGATGGCAG TGATGGAACT 

751 TCATGGATTG TTGTAAAAGA TCGGGGCCCT CGCTCTCTAG CAGATGTCGC 

801 GAATCA?VATT TGTAAGCCCA TAGCTTCCGC GATTATAAAA CTCGTTGGTT 

851 GGAACATAGA CTCTGTGAAA CCTAGCGAAA GATTGCGTTG TCCCGAAATT 

901 TTCATTTACA ACTCTAATCA TGATCAAGAA CTCATTAGCG ACGGCCTCTT 

951 CGAAAGAGAA AATTGCGTAG CAACACCTTT TCTAGAGCTT CCTGAAGTAA

1001 AAACCTCGGG GACTAAAATT CCTATACCCG AAAGGGATCT TCTCCATCTA 

1051 AATCCTCTCA GTCCAAATGT AGTAGACAGA TTAGCAGCAG TGATCTCTAA 

1101 TTATTTAGAT TCTGAAAACA GAAAGTCTCA GCAACCTGAT TAA

The PSORT algorithm predicts inner membrane (0.571). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 87A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
87B) and for FACS analysis.

These experiments show that cp7249 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.
Example 88 

The following C.pneumoniae protein (pid 4377261) was expressed <SEQ ID 175; cp7261>:
The cp7261 nucleotide sequence <SEQ ID 176> is: 

i ^ rCCCTA ^TCGATTTT ATTATTTTAT GTGATTCTAG Onn^inv. 
1 TGCCTACATA GCAGATAAGA AAAAACGAAA TGTTATTGGC 

1 tt ™attt attggtctag ttctccttct 

TCTCGTCGAA ACGCTTTAGA AAAGCCACAA AACGAtS 

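The PSORT algorithm predicts inner membrane (0.848).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 88A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
88B) and for FACS analysis.

These experiments show that cp7261 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.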
Example 89 

The following C.pneumoniae protein (pid 4377305) was expressed <SEQ ID 177; cp7305>:

1 ^ll^Tr XUO «^ HELKWSLDS CNSGWAYQEL 

= = = = = 

The cp7305 nucleotide sequence <SEQ ID 178> is:

101 ssss ss ^ss ss 

SSSSE IE? = SESS SS2S 

™J S^ TAGAG ^^GTT GGTTAAAAAT AAGGGaS 
™! TAGAAAGCTG CAAACTCCCC AGTTCTTATG TAAACCAGGT 
mnTm ^ AA ^ATAA ATCCAAACGG CCACGTATTG 

ATGTAGATTA TCATACGCTA CATAGCAMG ACTGGGTAGT TTTCCCTATC 
451 GTTTTTCAGA AAATTCCAAA GACCTCGCGT TTCAGTTATT GGTTCTCACA
501 AAAAGAAACA AGGAAGAGGG ATTATGTGAG AAATATGCTG GACCACGTCA
551 TTGGTTATCT AACGTCAGAA GGTGGGGAGT GGTTGCAGTA TATATCGAAA
601 ACCTCTTATC AAAGCGCTAC TTCCTTGGAT CCTGAAAGAG TTCTTCA^

In?" ^f TTAACT a* 0 **™* AGCTCCAGGG AGAAGTGCAA 
701 ATGAGGAGAG TGCGACCAAA AGCTCTGGGG ATAAGGAAGT TTTgSgT 

751 catgtatctg acattatttg ccagtgttgg tggccaaagt ttcttcaIgt 
85^ ^™ ccggccttta ttgaagaatt agtagaagaa » 

10 If} AACTTAATTT AGATTTTTTA TGCCTAGAAA AGGCTAATAC ATTAGATOAr 

GAGTTGAGAA ACAGTCTTCT AAGAGCAGTC GTACACcIcS 

inni ** aU »«™ GTGCCGGCCT CATTATTTAT ACGG^TA 

1001 TTCAATTACA GATTCCCTTC TCAAGGAGTT AA »v.-w a fl« s c ra 

The PSORT algorithm predicts inner membrane (0.508). 



The protein was expressed in E.coli and purified as a GST-fusion product (Figure 89A) and also as a
double GST/his fusion. The recombinant proteins were used to immunise mice, whose sera were
used in a Western blot (Figure 89B) and for FACS analysis.

These experiments show that cp7305 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.
Example 90 

The following C.pneumoniae protein (pid 437734?) was expressed <SEQ ID 179; cp7347>:

MKKGKLGAIV FGLLFTSSVA GFSKDLTKDN AYQDLNVIEH LISLKYAPLP 
WKELLFGWDL SQQTQQARLQ LVLEEKPTTN YCQKVLSNYV RSMDYHAGI 
TFYRTESAYI PYVLKLSFXm HvtmmMo ^t, t „ „ V™ 1 



51 
101 



uvuoijArrnM luyftViiSNYV RSLNDYHAGI 

151 It^tp^ fVLKLSEDG HVFWDVQTS QGDIYLGDEI LBvEL^ 

25 If} AIESLRFGRG SATDYSAAVR SLTSRSAAFG DAVPSGIAML KLRRPSGLIR 

201 STPVRWRYTP EHIGDFSLVA PLIPEHKPQI, PTQSCVLFRS GWSOSSSSS 

251 LFSSYMVPYF WEELRVQNKQ RFDSNHHIGS RNGFLPTFGP SdIIpy 

301 RSYIFKAKDS QGNPHRIGFL RISSYVWTDL EGLEEDHKDS pSStD 

351 HliEKETDALl IDQTHNPGGS VFYLYSLLSM LTDHPLDTPK S^EV 

30 t°J SSALHWQDLL EDVFTDEQAV AVLGETMEGY CMDMHAVASL O^SOsS 

30 !" WVSGDlNIiSK PMPLLGFAQV RPHPKHQYTK PLFmSS fSSSS 

501 LKDNGRATLI GKPTAGAGGF VFQVTFPNRS GIKGLSLTGS rlvRTOG^ 

Si 5££S 55EK£ SRFTDYVEAV K.XV^SrJ SSS 

A predicted signal peptide is highlighted. 
The cp7347 nucleotide sequence <SEQ ID 180> is:

ATGAAAAAAG GGAAATTAGG AGCCATAGTT TTTGGCCTTC TATTTAOaar 

tagtgttgct ggtttttcta aggatttgac taaagacaIc ™tca^g 

ATTTAAATGT CATAGAGCAT TTAATATCGT TAAAATATGC TCCTTTACCA 
TGGAAGGAAC TATTATTTGG TTGGGATTTA TCTCAGCAAA CACAGCA^rC 
TCGCTTGCAA CTGGTCTTAG AAGAAAAACC AACAACCAAC ^ACTGCCA^A 
AC™^r TAACTACGTG AGATCATTAA ACGATTATCA „ 
ACGTTTTATC GTACTGAAAG TGCGTATATC CCTTACGTAT TGAAGTTAAG 
TGAAGATGGT CATGTCTTTG TAGTCGACGT ACAGACtIgC CAAGGGGATA 
4 °l ™ C ™G GGATGAAATC CTTGAAGTAG ATGGAATGGG GMTCGtcIg 
45 GCTATCGAAA GCCTTCGCTT TGGACGAGGG AGTGCCACAG ACTATtS 

501 TGCAGTTCGT TCCTTGACAT CGCGTTCCGC CGCTTTTGGA gSgOTTC 
CTTCAGGAAT TGCCATGTTG AAACTTCGCC GACCCAGTGG TTTGATCCGT 

tcgacaccgg tccgttggcg watactcca gagcatatcI gSttttc 

TTTAGTTGCT CCTTTGATTC CTGAACATAA ACCTCAAOTA CCTAcIcaII 
GTTGTGTGCT ATTCCGTTCC GGGGTAAATT CACAGTCTTC TAGTAG™ 
TTATTCAGTT CCTACATGGT GCCTTATTTC TGGGAAGAAT SgTtS 
SSac ™ CACCA Ta .TAGGGAGC 

901 CGT^™* GTTTGGTCCT ATTCTTTGGG AACAAGACAA GGGGCCCTAT 
901 CGTTCCTATA TCTTTAAAGC AAAAGATTCT CAGGGCAATC CCCATCGCAT 
951 AGGATTTTTA AGAATTTCTT CTTATGTTTG GACTGATTTA GAAGGACTTT 
1001 AAGAGGATCA TAAGGATAGT CCTTGGGAGC Sc^ 
1051 CATTTGGAAA AAGAGACTGA TGCTTTGATT ATTGATCAGA CCCATAATCC
1101 TGGAGGCAGT GTTTTCTATC TCTATTCGTT ACTATCTATG
1151 ATCCTTTAGA TACTCCTAAA CATAGAATGA TTTTCACTCA GGATGAAGTC
1201 AGCTCGGCTT TGCACTGGCA AGATCTACTA GAAGATGTCr TCACAGATGA
1251 GCAGGCAGTT GCCGTGCTAG GGGAAACTAT GGAAGGATAT T^ATGGMA
1301 *™^ G ? AGCCTCTCTT CAAAACTTCT CTCAGAGTGT COTTTC^C
1351 TGGGTTTCAG GTGATATTAA CCTTTCAAAA CCTATGCCTT TGCTar™
14M !? CACAGGTT CGACCTCATC CTAAACATCA ATAtIcTaH COTTTG^ 
10 «„i ^ GTTGATAG * CGAGGATGAC TTCTCTTGTG GAGATTTAGC 

^ ™ GAAGGAT * ATGGCCGCGC TACTCTCATT GGAAAGCCAA CaSS 
lilt ^ AGGTTTT GTATTCCAAG TCACTTTCCC TAACCGTTCT 
■ I «t OICTTICW AACAGGATCT TTAGCTGTTA GGAAAGATGG 
1651 GAAAACTTAG GAGTGGCTCC TCATATTGAT TTAGGATTTA CC^CcIgGgI 
15 \nl, ^TTGCAAACT TCCAGGTTTA CTGATTACGT TGAGGCAGTG AAAACTATAG 

TTTTAACTTC TTTGTCTGAG AACGCTAAGA AGAGTGAAGA GCAGACTTCT 
1851 TGCTTCGTAA 0Q0CWU » W TATTCGAGTC TCTTATCCCA SSSS 

The PSORT algorithm predicts periplasmic space (0.2497). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 90A) and also in
his-tagged form. The recombinant proteins were used to immunise mice, whose sera were used in a
Western blot (Figure 90B) and for FACS analysis.

These experiments show that cp7347 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 
Example 91 

The following C.pneumoniae protein (pid 4377353) was expressed <SEQ ID 181; cp7353>:

51 SlsSo f^™ 8 STVSTASG ^ KTATGEVLVS CTALEGSSST ' 
ini Q IIllAT W S ^ I.QSTNVHQLL FLPPEWELE IQWDLLVOL 

™^ SEP QTOQSRSE QTLPQQSSSK QSAWPRSI* PEISDsS 
30 H ALQTPKDSAV RKHSEAPSPE TQARASLSQA SSSSQRSLPP OESAPER^S 

^! oS8^ SSFSP ls Q fsae kq K eadttsksSe lvkerdqdrq §SSS 

JiTTFKKKIjPS PMSVFSRPIP SKNPLSVGSS IHGPIOTPKV FMWTSpmti 
^RILGQAEA EANEIiYWRVK QRTDDVDTLT VLISKlS 
MKALLNRAKE IGVTIDKEKY TWTEEEKRLL KKWQMRKEN SkIToS 
DMQRHLQEIS QCHQARSNVL KLLKELMDTF lYNLRP*^ "EKITQMERT 

The cp7353 nucleotide sequence <SEQ ID 182> is:

A ™To TGC CTGTT CCTTC TGCAGTTCCC TCTGCAAATA TAACTCTAAA 
A ^™^ AGC TCAACAG ^TT CCACAGCCTC TGGAATATTA AAGACTGCAA 
™^ GT CTTAGTCTG * TGTACAGCGC TAGAAGGAAG CTCTTCTACA 
f™^™ TOAGC ™ G WAGGACAA ATCATTCTOG CGACCCA^CA 
AGAACTGCTC TTACAAAGCA CAAATGTTCA TCAACTCCTC TTCCTCC^TC 
CTGAAGTTGT AGAATTAGAA ATCCAAGTTG TTGACTTGCT 
,„ GAAGATGCAG AGACAATCAC AAGTGAACCA CAAGAAACAC 

45 am ™ AGTGAG c «»««k:c ctcaacaaag cagcagtaaa caS^c 

1\ TCTCCCCACG CTCCTTAAAA CCTGAAATTT CTGATTCTAA ACaIcaGOAa 

50 ACAC^ CTCTGCTCTA AGAaUcI^ GCGA^GCACC 
GTCACCTGAG ACACAAGCTC GCGCTTCCTT ATCTCAGGCA arrvw-aao™ 

551 CTCAGAGATC CTTACCTCCG CAAGAAAGTG CGCcfoA^G SSI 

50 «J GAACAACAAA AAGCAAGCTC CTTCTCTCCT CTATCCCAGT TCTCTGCAGA 

50 651 GAAACAAAAA GAGGCCCTGA CGACCTCAAA ATCTCATGM CTCTA^^r 

701 AACGCGATCA AGATCGCCAA CAAAGAGAGC AGcS AAAGCACGAT 

PHI ^AGGAAGAAG ACGCTGAATC TAAAAAGAAA AAGAAGAAAC GTGGTCTCGG 

55 901 AGATCAAATG CGACCTCCTG CTGAAGAAAC TTCTAAAAAA 

GAAACGACAT TCA AAAAGAA GCTACCTTCT CCAATGTCTG TGTTTAGCAG 
951 ATTCATCCCT AGTAAGAATC CGTTATCTGT AGGCTCTTCA IScS 
1001 CTATACAAAC TCCAAAAGTA GAAAATGTGT TCTT^TT CaSgCTC 
1051 ATGGCAAGAA TCTTAGGCCA AGCCGAAGCC GAAGCTAATG AACTCTACAT

1101 GCGAGTCAAA CAACGTACCG ATGATGTAGA CACACTCACA GTCCTTATCT 

1151 CTAAGATCAA TAATGAAAAG AAAGACATTG ATTGGAGTGA AAATGAAGAG 

1201 ATGAAAGCTC TTTTAAATCG AGCTAAAGAG ATTGGAGTCA CTATAGACAA 

" AGAAAAATAT ACTTGGACAG AAGAGGAAAA AAGACTTCTA AAAGAGAATG 

1301 TCCAAATGCG CAAAGAGAAT ATGGAGAAAA TCACTCAAAT GGAaKgGACG 

i 2ni ™ MGCAAA ggcac ^cca agagatttct caatg™ aag^^ctc 

itll ™CCCTA ^ OTTATTOA GGACACCTTC ATTTACAACC 

The PSORT algorithm predicts cytoplasm (0.1308).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 91A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
91B) and for FACS analysis.

These experiments show that cp7353 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.

Example 92 



The following C.pneumoniae protein (pid 4377408) was expressed <SEQ ID 183; cp7408>:

1 MLKIQKKRMC VSWITVGAI VGFFNSADAA PKKKKIPIQI LYSFTKVSSY
51 LKNEDASTIF CVDVDRGLLQ HRYLGSPGWQ ETRRRQLFKS LENQSYGNER
101 LGEETLAIDI FRNKECLESE IPEQMEAIIA NSSALVLGIS SFGITGIPAT

?m FQKRSIASES FLLKIDSAPS DASVFYKGVL FRGETAIVDA 

201 LSQLFAQLDL SPKKIIFLGE DPEWQAVGS AC I GVtfGMNFL GLVYYPAQES 

251 LFSYVHPYST ATELQEAQGL QVISDEVAQL T LNALPKMN * 

The cp7408 nucleotide sequence <SEQ ID 184> is: 

1 ATGTTGAAAA TCCAGAAAAA AAGAATGTGT GTCAGCGTAG TCATCACGGT

ioi a™^^ gtggggtttt ^aattotgc AGACGCAGCA ccaaagaaaa 

101 AGAAGATCCC TATACAGATT CTCTACTCCT TTACTAAAGT CTCTTCCTAT 
151 TTAAAAAACG AAGACGCAAG TACTATATTT TGCGTCGATG TGGATCGTGG 
ACTTCTCCAG CATCGGTATT TAGGTAGTCC AGGATGGCAG GAAACCAGAC 
GTCGGCAGTT ATTTAAATCC TTAGAAAATC AATCATACGG CAACGAACGT 
TTAGGAGAAG AAACTCTTGC TATTGATATT TTCAGGAACA AAGAGTGCTT 
GGAGAGCGAG ATCCCAGAGC AGATGGAAGC TATCCTTGCA AATTCCTCGG 
CCTTGGTCTT AGGCATCTCT TCTTTTGGGA TCACAGGAAT TCCTGCGACT 
QC til TTGCATAGTT TGCTTCGACA GAATCTATCT TTCCAAAAAC GCTCTATAGC 

35 It] ^GGAGAGC TTCCTT ™ AGATCGATAG TGCCCCCTCA gSgCCTC^ 

551 TTTTTTATAA AGGCGTGCTT TTCCGCGGAG AGACTGCGAT CGTGGATGCG 
TTAAGCCAAT tatttgccca GCTCGATCTT tctcctaaaa aaattatctt 
TCTAGGAGAA GACCC TGAGG TCGTTCAAGC TGTTGGGTCT GCTTGTATAG 
GTTGGGGCAT gaacttttta GGCCTGGTAT actatcctgc TCAAGAAAGC 
CTTTTTTCTT ATGTTCATCC TTACTCTACA GCAACGGAGC TCCAAGAAGC 

fim n~^ TTA CAAGTAATTT CAGATGAAGT CGCACAGCTT ACTTTAAACG 
851 CTCTTCCGAA AATGAATTAA 

The PSORT algorithm predicts inner membrane (0.123). 

The protein was expressed in E.coli and purified as a his-tag product (Figure 92A). The recombinant
protein was used to immunise mice, whose sera were used in a Western blot (Figure 92B) and for
FACS analysis.

These experiments show that cp7408 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
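
The contrast drawn in these examples, between a computational localisation prediction and the experimental finding of surface exposure, can be tabulated from the text itself. The following is a minimal sketch and not part of the original disclosure: the helper name `parse_psort` and its regular expression are illustrative assumptions, and the three sentences are copied from Examples 90 to 92.

```python
import re

# Hypothetical helper (not from the source): matches sentences of the form
# "The PSORT algorithm predicts <localisation> (<score>)."
PSORT_RE = re.compile(r"PSORT algorithm predicts ([A-Za-z ]+?) \((\d+\.\d+)\)")

def parse_psort(sentence):
    """Return (localisation, score) or None if the sentence does not match."""
    match = PSORT_RE.search(sentence)
    if match is None:
        return None
    return match.group(1), float(match.group(2))

sentences = {
    "cp7347": "The PSORT algorithm predicts periplasmic space (0.2497).",
    "cp7353": "The PSORT algorithm predicts cytoplasm (0.1308).",
    "cp7408": "The PSORT algorithm predicts inner membrane (0.123).",
}

# Collect the predicted compartment and score for each protein.
predictions = {name: parse_psort(text) for name, text in sentences.items()}
for name, (compartment, score) in predictions.items():
    print(f"{name}: predicted {compartment} (score {score})")
```

None of the three predicted compartments is the cell surface, which is the point the text makes when it says the surface exposure is not evident from the sequence alone.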
Example 93

The following C.pneumoniae protein (pid 4376424) was expressed <SEQ ID 185; cp6424>:

1 MMHNIWLSE EPGRSAFLGR TAFFPNKYPI AQGGVGIPST IGNLFTIWYC 
51 FYFYRAATPQ SDHPDGCGFI LLERLKELGA GFFYCDLRES NTTGFTLFFE 
101 GSNKGVLKHH LFIRDE* 

The cp6424 nucleotide sequence <SEQ ID 186> is: 

1 ATGATGCACA ATATTGTTGT TCTTAGTGAG GAACCTGGAC GAAGCGCTTT 

51 TCTTGGTAGG ACGGCATTTT TCCCTAATAA GTATCCAATA GCTCAGGGTG 

101 GTGTTGGAAT ACCATCTACA ATAGGCAATC TCTTTACTAT ATGGTACTGT 

,ni ™ ATTTTT ATAGAGCTGC AACTCCACAA TCTGATCATC CTGACGGATG 

201 TGGCTTTATT CTACTAGAAA GGCTTAAGGA GCTCGGTGCA GGGTTCTTTT 

251 ATTGTGATCT TCGTGAGTCC AATACCACTG GCTTTACTCT TTTTTTTGAA 

301 GGCTCCAATA AAGGTGTGTT AAAGAATCAC TTGTTTATTA GAGATGAGTA 
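
The protein and nucleotide listings of each example are related by the genetic code, so one can be checked against the other. The following is a minimal sketch, not part of the original disclosure: the function name `translate` is an illustrative assumption, the standard codon table is used, and the input is the first line of the cp6424 nucleotide listing above.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))

def translate(dna):
    """Translate complete codons from position 1; unknown codons become 'X'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))

# First line of the cp6424 listing (positions 1-50), with the spaces removed.
first_line = "ATGATGCACA ATATTGTTGT TCTTAGTGAG GAACCTGGAC GAAGCGCTTT".replace(" ", "")
print(translate(first_line))  # MMHNIVVLSEEPGRSA
```

The translation begins MMHNIVVLSE, whereas the protein listing as reproduced here reads MMHNIWLSE. This suggests, though it does not prove, that some "W" characters in the reproduced protein listings stand for a scanned "VV".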

The PSORT algorithm predicts cytoplasm (0.2502). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 93A) and also in
his-tagged form. The recombinant proteins were used to immunise mice, whose sera were used in
Western blots (Figure 93B) and for FACS analyses (Figure 93C; GST-fusion).

These experiments show that cp6424 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 94 

The following C.pneumoniae protein (pid 4376449) was expressed <SEQ ID 187; cp6449>:



1 VASETYPSQI IiHAQREVKDA YFNQADCHPA RANQILEAKK ICLLDVYHTN 

51 HYSVFTFCVD NYPNLRFTFV SSKNNEMNGL SNPLDNVLVE AMVRRTHARN 

101 LIAACKIRNI EVPRWGLDL RSGILISKLE LKQPQFQSLT EDFVWHSTNO 

151 EEARVHQKHV LLISLILLCK QAVLESFQEK KRSS*

The cp6449 nucleotide sequence <SEQ ID 188> is:

GTGGCGTCTG AAACGTATCC TTCTCAGATA TTGCACGCTC AGAGGGAAGT 
ACGTGATGCC TATTTTAATC AAGCGGATTG CCATCCTGCT CGGGCTAATC 
AGATTCTCGA GGCTAAGAAA ATCTGTTTAT TAGATGTTTA TCATACTAAT 
CATTATTCCG TATTTACTTT TTGTGTAGAT AATTATCCGA ATCTCCGCTT 
TACATTTGTA TCTTCAAAAA ACAATGAGAT GAATGGCTTA TCTAATCCTC 
251 TAGATAATGT TCTTGTAGAG GCTATGGTAC GTAGAACACA TGCAAGAAAC 
301 CTACTTGCAG CGTGTAAAAT TCGAAATATT GAGGTTCCAA GGGTTGTTGG 
351 GCTTGACCTA AGATCTGGGA TACTCATTTC GAAACTAGAA TTGAAGCAAC 
401 CTCAGI-TCCA AAGTTTAACA GAAGACTTCG TAAATCATTC CACAAATCAG 
451 GAAGAAGCTC GCGTCCATCA AAAGCATGTG TTGCTAATTT CTTTAATTTT 
501 ACTTTGCAAG CAGGCCGTTC TGGAATCATT CCAGGAAAAA AAGCGATCCT 
551 CTTAA

The PSORT algorithm predicts inner membrane (0.2084). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 94A) and also in
his-tagged form. The recombinant proteins were used to immunise mice, whose sera were used in 
Western blots (Figure 94B) and for FACS analyses (Figure 94C; GST-fusion). 

These experiments show that cp6449 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 
Example 95

The following C.pneumoniae protein (pid 4376495) was expressed <SEQ ID 189; cp6495>:

The cp6495 nucleotide sequence <SEQ ID 190> is:

™ a ™™ cttatgatgataa ^ 

The PSORT algorithm predicts cytoplasmic (0.280).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 95A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
95B) and for FACS analysis (Figure 95C). 

These experiments show that cp6495 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 96 

The following C.pneumoniae protein (pid 4376506) was expressed <SEQ ID 191; cp6506>: 



1 MKRFLFLILS SLPLVAFSAD NFTILEEKQS PLSRVSIIFA LPGVTPVSFD

51 GNCPIPWFSH SKKTLEGQRI YYSGDSFGKY FWSALWPNK VSSAWACNM

101 ILKHRVDLIL, IIGSCYSRSQ DSRFGSVLVS KGYINYDADV RPFFERFEIP 

151 DIKKSVFATS EVHREAILRG GEEFISTHKQ EIEELLKTHG YLKSTTKTEH 

2 01 TLtMEGLVATG ESFAMSRNYF LSLQKLYPEI HGFDSVSGAV SQVCYEYSIP 
251 CLGVNILLPH FLjESRSNEDW KHLQSEASKI YMDTLLKSVL KELCSSH* 

The cp6506 nucleotide sequence <SEQ ID 192> is:

1 ATGCGTCGTT TTCTGTTTCT TATTCTTAGC TCTCTTCCTT TGGTCGCATT 

51 CTCTGCTGAT AATTTCACTA TTCTAGAAGA AAAACAGAGT CCSTTAAGTC 

101 GTGTAAGTAT TATTTTTGCT TTACCTGGGG TTACTCCCGT TTCTTTTGAT 

X \ X GGTAATTG,r C CTATTCCTTG GTTTTCTCAT AGTAAAAAGA CTCTAGAGGG 

5U 201 ACAGAGAATT TATTACTCTG GCGACTCCTT TGGGAAATAC TTTGTAGTTT 

251 CTGCTCTTTG GCCTAATAAA GTTTCTTCAG CTGTTGTGGC TTGTAATATG 

301 ATTCTTAAAC ATCGAGTGGA TCTTATTCTA ATTATAGGCT CGTGTTACTC 

3 51 TAGGTCTCAA GATAGCCGTT TTGGCAGCGT CTTAGTITCT AAAGGCTACA 
401 TTAATTATGA TGCAGATGTG AGGCCTTTCT TTGAAAGATT TGAGATTCCA 

53 451 GACATTAAAA AGAGTGTTTT TGCAACCAGT GAGGTTCATC GGGAGGCAAT 

501 TCTTCGTGGA GGCGAAGAGT TTATTTCTAC CCATAAACAA GAAATCGAAG 

5 51 AGCTTTTGAA GACTCATGGG TATTTGAAAT CAACAACCAA AACGGAGCAC 

601 ACCTTAATGG AAGGTTTGGT TGCTACAGGC GAGTCTTTCG CGATGTCGCG 

651 AAACTATTTT CTTTCCTTAC AAAAATTGTA TC CAGAGATT CATGGTTTTG 

VK) 701 ATAGTGTCAG CGGCGCTGTT TCTCAGGTAT GCTATGAATA TAGCATTCCT 

751 TGTTTAGGTG TGAATATCCT TCTCCCTCAT CCTTTAGAAT CACGGAGTAA 

801 CGAGGATTGG AAGCATCTTC AAAGTGAGGC AAGTAAAATT TATATGGATA 

851 CCTTGCTCAA GAGTGTATTA AAAGAACTCT GTTCTTCTCA TTAA 

The PSORT algorithm predicts periplasmic space (0.571). 

The protein was expressed in E.coli and purified as his-tag (Figure 96A) and GST-fusion (Figure
96B) products. The GST-fusion protein was used to immunise mice, whose sera were used in a 
Western blot (Figure 96C) and for FACS analysis (Figure 96D). 
These experiments show that cp6506 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 97 

The following C.pneumoniae protein (pid 4376882) was expressed <SEQ ID 193; cp6882>:

,! ^f^ LPSSQ DSASEDST SQ SQIFDPIRNR ELVSTPEEKV RQRLLSFLMH 

51 KLNYPKKLII IEKELKTLFP LLMRKGTLIP KRRPDXLIIT PPTYTDAOGN 

101 THNLGDPKPL Uh IECKALAV WQNALKQLLS YNYSIGATCI AMAGKHSOVS 

151 ALPNPKTQTL DFYPGLPEYS QLLNYFISUH L* W 

The cp6882 nucleotide sequence <SEQ ID 194> is: 

1 ATGTCCTTAT TGAACCTTCC CTCAAGCCAG GATTCTGCAT CTGAGGACTC 

51 CACATCGCAA TCTCAAATCT TCGATCCCAT TAGAAATCGG GAGTTAGTTT 

101 CTACTCCCGA AGAAAAAGTC CGCCAAAGGT TGCTCTCCTT C CTAATGC AT 

151 AAGCTGAACT ACCCTAAGAA ACTCATCATC ATAGAAAAAG AACTCAAAAC 

201 TCTTTTTCCT CTGCTTATGC GTAAAGGAAC CCTAATCCCA AAACGCCGCC 

;l] ^AGATATTCT CATCATCACT CCCCCCACAT ACACAGACGC ACAGGGAAAC 

301 ACTCACAACC TAGGCGACCC AAAACCCCTG CTACTTATCG AATGTAAGGC 

351 CTTAGCCGTA AACCAAAATG CACTCAAACA ACTCCTTAGC TATAACTACT 

401 CTATCGGAGC CACCTGCATT GCTATGGCAG GGAAACACTC TCAAGTGTCA 

451 GC TCTCTTC A ATCCAAAAAC ACAAACTCTT GATTTTTATC CTGGCCTCCC 

501 AGAGTATTCC CAACTCOTAA ACTACTTTAT TTC TTTAAAC TTATAG 

The PSORT algorithm predicts cytoplasm (0.362). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 97A). The protein 
was used to immunise mice, whose sera were used in a Western blot (Figure 97B) and for FACS 
analysis (Figure 97C). 

These experiments show that cp6882 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 98 

The following C.pneumoniae protein (pid 4376979) was expressed <SEQ ID 195; cp6979>:

^ ^™ SK ^^GA™ QHPDVKESGV TSANLGSHRV TASGGRQGLL 

51 ARIKEAVTGF FSRMSFFRSG APRGSQQPSA PSADTVRSPL PGGDARATEG 

101 AGRNLIKKGY QPGMKVTIPQ VPGGGAQRSS GSTTLKPTRP APPPPKTGGT 

151 NAKRPATHGK GPAPQPPKTG GTNAKRAATH GKGPAPQPPK GILKOPGOSG 

201 TSGKKRVSWS DED*

The cp6979 nucleotide sequence <SEQ ID 196> is: 

1 ATGTCTGTTA ATCCATCAGG AAATTCCAAG AACGATCTCT GGATTACGGG 

51 AGCTCATGAT CAGCATCCCG ATGTTAAAGA ATCCGGGGTT ACAAGTGCTA 

101 ACCTAGGAAG TCATAGAGTG ACTGCCTCAG GAGGACGCCA AGGGTTATTA 

151 GCACGAATCA AAGAAGCAGT AACCGGGTTT TTTAGTCGGA TGAGCTTCTT 

201 CAGATCGGGA GCTC CAAGAG GTAGCCAACA ACCCTCTGCT CCATCTGCAG 

251 ATACTGTACG TAGCCCGTTG CCGGGAGGGG ATGCTCGCGC TACCGArrra 

301 GCTGGTAGGA ACTTAATTAA AAAAGGGTAC C^CCAGGGA TGAAAGTCAC 

351 TATCCCACAG GTTCCTGGAG GAGGGGCCCA ACGTTCATCA GGTAGCACGA 

401 CACTAAAGCC TACGCGTCCG GCACCCCCAC CTCCTAAAAC GgS5 
™ OTCCGGCAAC GCACGGGAAG GGTCCAGCAC CCCAGCCTCC 

rr j l^^ttl GGGACCAATG CTAAGCGCGC AGCAACGCAT GGGAAAGGTC 

551 CAGCACCTCA ACCTCCTAAG GGCATTTTGA AACAGCCTGG GCAGTCTGGG 

601 ACTTCAGGAA AGAAGCGTGT CAGCTGGTCT GACGAAGATT AA

The PSORT algorithm predicts cytoplasm (0.360). 
The protein was expressed in E.coli and purified as a GST-fusion product (Figure 98A). The
GST-fusion protein was used to immunise mice, whose sera were used in a Western blot (Figure 98B) and
for FACS analysis (Figure 98C).

These experiments show that cp6979 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 99 

The following C.pneumoniae protein (pid 4377028) was expressed <SEQ ID 197; cp7028>:

1 MLLGFLCDCP CASWQCAAVA NCYDSVFMSR PEHKPNIPYI TKATRRGURM 

51 KTLAYIASLK DARQLAYDFL KDPGSLARLA KALIAPKEAL QEGNLFFYGC 

101 SNIEDILEEM RRPHRILLLG FSYCQKPKAC PEGRFNDACR YDPSHPTCAS 

151 CSIGTMMRLN ARRYTTVIIP TFIDIAKHLH TLKKRYPGYQ ILFAVTACELi 

osi l^f GDYAS V™* 1 *™™** LTGRICNTFK AFKLAERGVK PGVTILEEDG 
251 FEVIiARILTE YSSAPFPRDF CEIH* 

The cp7028 nucleotide sequence <SEQ ID 198> is: 



1 ATGCTTCTAG GGTTTTTGTG TGACTGCCCC TGTGCTTCGT GGCAGTGTGC 

51 GGCCGTTGCT AATTGTTATG ATTCCGTATT TATGTCTAGA CCAGAGCACA 

101 AACCTAATAT TCCTTATATT ACTAAAGCTA CAAGACGGGG TCTGCGTATG 

oni i^SSlS 00 ™ 3 CTTATCTG GC CTCTTTAAAA GATGCTAGAC AGCTTGCCTA 

20 TGATTTTCTG AAAGATCCTG GTTCTTTAGC TCGGTTAGCT AAGGCTTTGA 

ZU 251 TAGCTCCTAA GGAGGCCTTA CAGGAGGGCA ACCTATTTTT TTATGGCTGT 

301 AGTAATATTG AGGATATTTT AGAGGAGATG CGTCGTCCTC ATAGAATCCT 

351 TTTGTTAGGA TTTTCTTATT GTCAAAAGCC TAAGGCATGT CCTGAAGGGC 

401 GTTTCAATGA TGCTTGTCGG TATGATCCTT CACATCCTAC ATGTGCCTCA 

OS til TCTTCTATAG GGACCATGAT GCGGCTGAAT GCTCGTAGAT ACACTACTGT 

^ 501 GATCATCCCT ACATTTATAG ATATCGCAAA ACATTTACAC ACTTTAAAAA 

551 AGCGCTACCC TGGATATCAA ATTCTCTTTG CAGTTACTGC TTGTGAACTT 
601 TCCTTAAAAA TGTTTGGAGA TTATGCCTCC GTAATGAACT TAAAGGGTGT

651 GGGCATCAGA CTCACAGGAC GTATTTGCAA TACATTTAAG GCATTTAAAT
30 ™] !^ GAGCG AGGAGTCAAA CCAGGAGTCA CTATCCTAGA AGAAGATGGC 

iV l 5 l TTTGAGGTAT TAGCAAGGAT TCTTACAGAA TACAGTAGCG CTCCTTTCCC 

801 TAGAGACTTT TGTGAGATCC ATTAG

The PSORT algorithm predicts cytoplasm (0.1453). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 99A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
99B) and for FACS analysis (Figure 99C).

These experiments show that cp7028 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 100 

The following C.pneumoniae protein (pid 4377355) was expressed <SEQ ID 199; cp7355>: 

1 MKKWTLSII FFATYCASEL SAVTWAVPL SEAPGKIQVR PWGLOFOEE 
51 QGSVPYSFYY PYDYGYYYPE TYGYTKNTGQ ESRECYTRFE DGT1FYECD* 

The cp7355 nucleotide sequence <SEQ ID 200> is:

1 ATGAAGAAAG TCGTAACACT ATCCATTATA TTTTTCGCAA CGTATTGTGC 

51 ATCAGAGCTT AGTGCTGTAA CTGTAGTGGC TCTGCCTTTA TCAGAGGCTC 

101 CAGGGAAGAT TCAAGTTCGT CCCGTCGTTG GTCTGCAATT TCAAGAAGAA

151 CAGGGTTCTG TGCCCTATAG TTTTTATTAT CCTTATGACT ATGGGTATTi 

201 CTATCCAGAG ACTTATGGCT ATACTAAAAA TACAGGTCAA GAAAGTCGCG 
251 AATGTTATAC CCGATTTGAA GATGGCACAA TTTTTTATGA ATGCGATTAG

The PSORT algorithm predicts inner membrane (0.143). 

The protein was expressed in E.coli and purified as a GST-fusion (Figure 100A) and a his-tag
product. The proteins were used to immunise mice, whose sera were used in a Western blot (Figure 
100B) and for FACS analysis (Figure 100C). 

These experiments show that cp7355 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 101 

The following C.pneumoniae protein (pid 4377380) was expressed <SEQ ID 201; cp7380>: 

1 VHYCERTLDP KYILKIALKL RQSLSLPFQN SQSLQRAYST PYSYYRIILO 
101 ^S^r?^ f^^l 1 ™! WOlIiFvTnj, SLSKNQREGC STDMAWSTP 



FFNRMLWYRL LSSRFSLWKS YCPRFFLDYL EAFGLLSDFL DHQAV1KFFE 
151 IiETHFSYYPV SGFVAPh^VT, SLLQDRYFPI ASVMRTLDKD NFSLTPDLIH 
15 201 DLLGHVPWLL HPSFSEFFIN MGRLFTKVTE KVQALPSKKQ RIQTLQSNLI 

251 AITOCFWFTV ESGLIENHEG RKAYGAVLIS SPQELGHAFI DNVRVLPLEL 
3" ESSE C S ^ QETLFSI SKLEWMLDQG ^ESXPI^Q 



The cp7380 nucleotide sequence <SEQ ID 202> is: 
GTGCACTACT GCGAGAGAAC CCTGGACCCA AAGTATATTC TGAAGATTGC
TCTAAAGCTG AGACAATCAC TTTCCCTGTT CTTCCAGAAC AGCCAATCAC 
\l\ ^o^r GC ATACTCGACC CCATATTCCT ACTACCGAAT CATTCTACAA 
151 AAGGAAAATA AAGAGAAGCA AGCTTTAGCT CGACACAAAT GCATTTCTAT 
201 TTTAGAATTT TTCAAAAACT TACTCTTTGT TCATCTTCTG TCATTATCAA 
9 c * = AGAATCAAAG GGAAGGTTGC TCCACTGATA TGGCTGTTGT AAGCACTCCC 

lZ> 301 TTTTTTAATC GGAATTTATG GTATCGACTC CTTTCCTCAC GGTTTTCTCT 

ATGGAAAAGC TATTGTCCAA GATTTTTTCT TGATTACTTA GAAGCTTTCG 
GTCTCCTTTC TGATTTCTTA GACCATCAAG CAGTCATTAA ATTCTTCGAA 
TTAGAAACAC ATTTTTCCTA TTATCCCGTT TCAGGATTTG TAGCTCCCCA 
TCAATACTTG TCTCTGTTGC AGGACCGTTA CTTTCCCATT GCCTCTGTAA 
TGCGAACTCT CGATAAAGAT AATTTCTCCT TAACTCCTGA TCTCATCCAT 
601 GACCTTTTAG GGCACGTGCC TTGGCTTCTA CATCCCTCAT TTTCTGAATT 
651 TTTCATAAAC ATGGGAAGAC TCTTCACTAA AGTCATAGAA AAAGTACAAC 
7 01 CTCTTCCTAG TAAAAAACAA CGCATACAAA CCCTACA^ CAATCTGATC 
35 Rnt ™™ GCTGCTTTTG GTTTACTGTT GAAAGCGGAC TTATTGAAAA 

M 801 CCATGAAGGA AGAAAAGCAT ATGGAGCCGT TCTTATCAGT TCTCCTCAGG 

851 AACTTGGACA CGCTTTCATT GATAACGTAC GTGTTCTCCC TTTAGAATTG 
901 GATCAGATTA TTCGTCTTCC CTTCAATACA TCAACTCCAC AAGAGAC^ 
, *™AATA AGACATTTTG ATGAACTGGT AGAACTCACT TCAAAATTAG 

™2J AATGGATGCT CGACCAAGGT CTGTTAGAAT CAATTCCCCT TTACAATCAA 
1051 GAGAAATATC TTTCTGGTTT TGAGGTACTT TGCCAATGA

The PSORT algorithm predicts inner membrane (0.1362).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 101A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
101B) and for FACS analysis (Figure 101C). 

These experiments show that cp7380 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 
Example 102 

The following C.pneumoniae protein (pid 4376904) was expressed <SEQ ID 203; cp6904>:

1 MMNYEDAKLR GQAVAILYQI GAIKFGKHIL ASGEETPLYV DMRLVISSPE

51 VLQTVATLIW RLRPSFNSSL LCGVPYTALT LATSISLKTO IPMVLRRKEL 

101 QNVDPSDAIK VEGLFTPGQT CLVINDMVSS GKSIIETAVA LEENGLWRE 

151 ALVFLDRRKE ACQPLGPQGI KVSSVFTVPT LIKALIAYGK LSSGDLTLAN 

201 KISEILEIES*

The cp6904 nucleotide sequence <SEQ ID 204> is: 

1 ATGATGAACT ACGAAGATGC AAAATTACGC GGTCAAGCTG TAGCAATTCT

51 ATACCAAATC GGAGCTATAA AGTTCGGAAA ACATATTCTC GCTAGCGGAG

101 AAGAAACTCC TCTGTATGTA GATATGCGTC TTGTGATCTC CTCTCCAGAA

151 GTTCTCCAGA CAGTGGCAAC TCTTATTTGG CGCCTCCGCC CCTCATTCAA

201 TAGTAGCTTA CTCTGCGGAG TCCCTTATAC TGCTCTAACC CTAGCAACCT

251 CGATCTCTTT AAAATATAAC ATCCCTATGG TATTGCGAAG GAAGGAATTA

301 CAGAATGTAG ACCCCTCGGA CGCTATTAAA GTAGAAGGGT TATTTACTCC

351 AGGACAAACT TGTTTAGTCA TCAATGATAT GGTTTCCTCA GGAAAATCTA

401 TAATAGAGAC AGCAGTCGCA CTGGAAGAAA ATGGTCTGGT AGCTCGTGAA

451 GCATTGGTAT TCTTAGATCG TAGAAAAGAA GCGTGTCAAC CACTTGGTCC 

501 ACAGGGAATA AAAGTCAGTT CGGTATTTAC TGTACCCACT CTGATAAAAG 

551 CTTTGATCGC TTATGGGAAG CTAAGCAGTG GTGATCTAAC CCTGGCAAAC 

601 AAAATTTCCG AAATTCTAGA AATTGAATCT TAA 

The PSORT algorithm predicts cytoplasm (0.0358).

The protein was expressed in E.coli and purified as a his-tag product (Figure 102A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
102B) and for FACS analysis. 

The cp6904 protein was also identified in the 2D-PAGE experiment. 

These experiments show that cp6904 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 103 

The following C.pneumoniae protein (pid 4376964) was expressed <SEQ ID 205; cp6964>:

1 MKKXXALIGI FLVPIKGWTN KEHDAHATVL KAARAKYWLF FVQDVFPVHE

51 VIEPISPDCL VHYEGWV*

The cp6964 nucleotide sequence <SEQ ID 206> is: 

1 ATGAAAAAAT TGATTGCTTT GATAGGGATA TTTCTTGTTC CAATAAAAGG 
51 AAATACCAAT AAGGAACACG ACGCTCACGC GACTGTTTTA AAAGCGGCCA 
oir 101 GAGCAAAGTA TAATTTGTTC TTTGTTCAGG ATGTTTTCCC TGTACACGAA 

^ 151 GTTATCGAGC CTATTTCTCC CGATTGCCTG GTVACATTATG AAGGGTGGGT 

201 TTGA 

The PSORT algorithm predicts inner membrane (0.091). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 103A) and also in
his-tagged form. The recombinant proteins were used to immunise mice, whose sera were used in a
Western blot (Figure 103B) and for FACS analysis (Figure 103C).

These experiments show that cp6964 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 104 

The following C.pneumoniae protein (pid 4377387) was expressed <SEQ ID 207; cp7387>:

1 LNFAKIDHNH LYLTCLGDLG VACPILSTDC LFNYSEKASH EVLVYSKFRC
51 ISGEPSRLAT SGNOTYYSIV SLPIGLRYEV TSFSGRHDFN IDMHVAPKIG
101 AVLSHGTREA KEIPGSSKDY AFFSLTARES LMISEKLAMT FQVSEVIQNC
151 YSQCTKVTKT NLKEQYRHLS HNTGFELSVK SAF*

The cp7387 nucleotide sequence <SEQ ID 208> is: 



1 TTGAATTTTG CAAAGATTGA TCACAATCAT CTCTACCTTA CATGTTTGGG
51 AGATCTTGGT GTAGCTTGTC CTATACTTTC TACAGATTCT CTACCwS
^ A * AGCGAGAA AGCATCTCAT GAGGTTCTTG TTTATAGTAA ATTTAGMGC 
10 ATTTCTGGAG AGCCATCTCG ACTTGCAACT TCAGGAAATG ACACATATTA 

W 201 TTCTATAGTA AGTTTACCTA TAGGACTCCG TTACGAAGTG ACTTcIcCAT 

^ ^ G ° ACGTCA TGATTTCAA * ATTGATATGC ATGTAGCTCC AAAGAtS 
3« ^™ CTCATGGAAC ACGAGAGGCT AAAGAGATCC CAGGATCTTC 
AAAAGACTAT GCATTTTTTA GCTTGACTGC TAGAGAAAGT TTAATGATTT 
15 CT GAAAAGCT TGCGATGACT TTCCAAGTTA GCGAAGTTAT TCAGMTOT 

5 451 TATTCACAAT GTACTAAAGT AACGAAAACT AATTTAAAAG AACAGTATAG 

501 GCACTTATCC CACAATACAG GGTTTGAGTT AAGCGTCAAG TCTGCATTCT 

The PSORT algorithm predicts inner membrane (0.043). 

The protein was expressed in E.coli and purified as a his-tagged fusion product (Figure 104A) and
also as a GST-fusion (Figure 104B). The recombinant proteins were used to immunise mice, whose 
sera were used in a Western blot and for FACS analysis (Figure 104C; his-tagged). 

These experiments show that cp7387 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 105 

The following C.pneumoniae protein (PID 4376281) was expressed <SEQ ID 209; cp6281>:

*i ^ Q ^ FHP1V FSD0SLSFL P YLGKSSGIIE KCSNIVEHYL HLGGDTSVII 

51 TGVSGATFLS VDHALPISKS EKIIKILSYI IiILPLIIALF XKtVXJOlS 

101 FKYRGLILDV KKEDLKKTLT PDQENLSLPL PSPTTLKKIH ALhSwSGK 

ll\ l^tl? EQF SFTKITDLQ 2 APSPKQDIGF SYNSLLPNFY ^ vS 

251 LTFFPWIYP ™QACSFV FRSLHLPSMQ TKDKKAGFGL 

The cp6281 nucleotide sequence <SEQ ID 210> is: 

J! ™r CTTC hGTTTTTTC * TCCTATAGTC TTCTCGGATC AGTCCTTATC 

!ni ^Z~7 CCT TACCTAGGAA AAAGCTCTGG CATTATTGAA AAATGTTCCA 

"J A ^ GTTGA AG ACTAri»rA CATTTGGGAG GAGACACTTC TGTTATCATC 

\l\ A ™ AGTTT CTGGAGCTAG CTTTCTATCT GTTGATCATG CCCTCCCAAT 

201 CTCGAAATCT GAAAAAATAA TAAAAATTCT CTCCTATATT TTAATTCTTC 

Ini ^^ GATTCT AGCTCTCTTT ATTAAGATCG TTTTACGCAT TATCTTATTC 

«J a™Z ATC GTGGTCTAA * CCTAGATGTT AAGAAGGAGG ATTTGAAAAA 

lm AACACTTACA CCTGACCAAG AAAACCTCAG TCTTCCTTTA CCATCTCCTA 

401 CAACATTAAA GAAAATTCAT GCGCTACACA TTTTAGTGCG TTCTGGAAAA 

tnl ^^ CG AGCTTATACA AGAAGGGTTT TCTTTCACTA AAATCACAGA 

501 TCTTGGTCAA GCTCCTTCAC CAAAGCAAGA TATTGGCTTC TCTTATAATT 

551 CCCTTCTCCC TAACTTCTAT TTTCATTCCT TGGTATCTGT TCCAAATATT 

601 TCAGGCGAGG AACGGGCTCT TAATTATCAT AAAGAACAAC AAGAGGAAAT 

651 GGCTGTTAAA TTAAAAACAA TGCAAGCGTG TTCTTTTGTC TTCCGATCCC 

701 TGCATTTACC TTCAATGCAA ACGAAGGACA AAAAGGCTGG ATTTGGACTA 

751 CTGACGTTTT TCCCTTGGAA AATCTACCCC CTATAA

The PSORT algorithm predicts inner membrane (0.5373). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 105A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
105B) and for FACS analysis.
These experiments show that cp6281 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 106 and 
Example 107 

The following C.pneumoniae protein (pid 4376306) was expressed <SEQ ID 211; cp6306>:

1 MGNHETYIHP GVLPSSHAQD VSRSTVYPSR SFIMRRMLMG WNFKTRVPSKS 
51 SEQLMDGHRI PLIFFGKHHP TISILNVNRF SWLSIFYNGE RGF* 

The cp6306 nucleotide sequence <SEQ ID 212> is: 

1 ATGGGAAACC ATGAGACCTA TATACATCCA GGAGTGCTCC CGAGTAGTCA

51 TGCTCAGGAT GTTAGCAGAT CTACAGTTTA CCCCAGTCGA AGTTTTATCA

101 TGAGACGTAT GCTCATGGGC TGGAATTTCA ATCGTGTTCC CTCGAAGAGC 

151 TCCGAGCAGT TAATGGATGG TCATCGCATA CCTCTTATAT TTTTTGGGAA 

201 GCATCATCCT ACTATATCTA TTTTAAATGT CAATAGATTT TCTTGGCTCT 

251 CCATTTTTTA CAATGGAGAA AGGGGGTTTT GA 

The PSORT algorithm predicts cytoplasm (0.167).

The following C.pneumoniae protein (pid 4376434) was also expressed <SEQ ID 213; cp6434>:

1 MSESINRSIH LEASTPFFIK LTNLCESRLV KITSLVISLL ALVGAGVTLV 
51 VTjFVAGILPL LPVLILEIIL ITVLVLLFCL VLEPYLIEKP SKIKELPKVD 
101 ELSWETDST L* 

The cp6434 nucleotide sequence <SEQ ID 214> is:

1 ATGTCTGAAA gtattaacag aagcattcat ttagaagcct ctacaccatt 

51 TTTTATAAAA TTAACGAATC TCTGTGAAAG TAGATTAGTT AAGATCACTT 

101 CTCTTGTTAT TTCTCTATTA GCTTTAGTGG GTGCGGGAGT CACTCTTGTG 

151 GTTTTATTTG TAGCTGGGAT CCTTCCTTTA CTTCCTGTAC TCATCTTAGA 

201 AATTATTTTA ATAACCGTCC TTGTCTTGCT TTTTTGTTTG GTATTGGAAC

251 CTTATTTAAT AGAAAAACCT AGTAAAATAA AGGAACTACC TAAAGTAGAC 

301 GAGCTATCTG TAGTAGAAAC GGACAGTACT CTTTAA

The PSORT algorithm predicts inner membrane (0.6859). 

The proteins were expressed in E.coli and purified as his-tag products (Figure 106A; 6306 = lanes
2-4; 6434 = lanes 8-10). The recombinant proteins were used to immunise mice, whose sera were
used in Western blots (Figures 106B & 107) and for FACS analysis. 

These experiments show that cp6306 & cp6434 are surface-exposed and immunoaccessible proteins, 
and that they are useful immunogens. These properties are not evident from the sequences alone. 

Example 108 

The following C.pneumoniae protein (pid 4377400) was expressed <SEQ ID 215; cp7400>:

1 MRVMRFFCLF FLGFLGSFHC VAEDKGVDLF GVWDDNQITE CDDSYMTEGR
51 EEVEKWDA

The cp7400 nucleotide sequence <SEQ ID 216> is:

1 GTGAGAGTTA TGAGATTTTT TTGTCTATTT TTTCTTGGGT TCCTAGGATC 

51 TTTTCATTGT GTTGCTGAAG ACAAGGGCGT GGATTTATTT GGAGTCTGGG 

101 ACGATAACCA AATTACAGAG TGTGACGATA GTTACATGAC AGAGGGTCGT 

151 GAAGAGGTTG AAAAGGTAGT GGACGCTTAG 
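
A printed listing such as the one above mixes position labels, spaces and line breaks with the bases, so it has to be cleaned before any length or reading-frame check. The following is a minimal sketch, not part of the original disclosure: the function names `clean_listing` and `check_reading_frame` are illustrative assumptions, and the input is the cp7400 listing as reproduced above.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}

def clean_listing(block):
    """Drop position labels, spaces and line breaks; keep only A, C, G and T."""
    return re.sub(r"[^ACGT]", "", block.upper())

def check_reading_frame(seq):
    """Return (length, is_multiple_of_three, ends_with_stop_codon)."""
    return len(seq), len(seq) % 3 == 0, seq[-3:] in STOP_CODONS

listing = """
1 GTGAGAGTTA TGAGATTTTT TTGTCTATTT TTTCTTGGGT TCCTAGGATC
51 TTTTCATTGT GTTGCTGAAG ACAAGGGCGT GGATTTATTT GGAGTCTGGG
101 ACGATAACCA AATTACAGAG TGTGACGATA GTTACATGAC AGAGGGTCGT
151 GAAGAGGTTG AAAAGGTAGT GGACGCTTAG
"""

seq = clean_listing(listing)
length, whole_codons, ends_with_stop = check_reading_frame(seq)
print(length, whole_codons, ends_with_stop)  # 180 True True
```

A length of 180 bases corresponds to 60 codons, that is, 59 residues followed by a stop codon, which gives a simple figure to compare against the protein listing.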

The PSORT algorithm predicts periplasmic space (0.924). 
The protein was expressed in E.coli and purified as a GST-fusion product (Figure 108A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
108B) and for FACS analysis. 

These experiments show that cp7400 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 
Example 109 

The following C.pneumoniae protein (PID 4376395) was expressed <SEQ ID 217; cp6395>:

™™ SSFV ™ GPSW ^KT SVAQEVFKKH GKGIQVLLST SVMLF IGLGV 
CAFIFPQYLI VFVLTIALLM LAISLVLFLL IRSVRSSMVD RL^SEKGYA 
LHQHENGPFL DVKRVQQILL RSPYIKVRAL WPSGDIPEDP SoISSsp 
WTFFSSVDVK ALLPSPQEKE GKYTDPVLPK LSRIERVSLL 
LNEQGVNPLM NNEEFLFFIN KKAREHGIQD IiKHEIMSSLE KTGVfSpsM 

frSfT 3 V""™"*- ttselrcfhij lscfkgdwh ciIsfSd 

301 LADSDFLEAC KNVEWGEFIS ACEKALLKNP QGISIKDLKQ FLVR

The cp6395 nucleotide sequence <SEQ ID 218> is:




™S™ CTATGTCATG atcgtttgtg tataatgggc cttcgtggat 

TTTAAAAACG TCAGTAGCTC AGGAGGTATT TAAAAAGCAC GGTAAGGGGA 
TTCAGGTTCT CTTAAGTACT TCAGTGATGC TTTTTATAGG S^Sc 
TGTGCCTTTA TATTTCCTCA ATATCTGATT GTTTTTGTTT TGACTATAGC 
TTTGCTTATG CTCGCTATAA GCTTGGTATT GTTTCTCTTA ArACGTTCTG 
101 r^Zn7 C AATCOT *« a ' CGTTTGTGGT GTTCTGAAAA AGGATATGCT 
fTTCATCAAC ATGAGAACGG GCCTTTTTTG GATGTGAAGC GTGTACaScI 
351 AATTCTTCTA AGATCACCCT ATATTAAAGT TCGGGCTTTA TGGCCGTC^G 
25 GAGATAT CCC TGAGGATCCT TCACAAGCTG CGGTTCTATT ACTTTCTCCT 

**»CTTTCT TTTCATCCGT GGATGTAGAG GCTTTATTAC CGAGTCCTcI 
501 AGAAAAGGAG GGTAAGTATA TAGATCCTGT GCTGCCTAAG TTCTCTAGGA 
TAGAGAGAGT CTCACTTTTA GTGTTTTTGA GTGCATTTAC TTTGGATGAC 
TTAAACGAAC AGGGAGTCAA TCCTTTGATG AATAATGAGG AATTTTTATT 
TTTTATAAAT AAGAAAGCGC GTGAGCATGG GATTCAGGAT TTAAAACACG 
AGATTATGTC TTCGTTAGAG AAAACAGGAG TGCCATTAgI £££££££ 
«m AG ^ TTTCAAG TTTCACAAGC GATGTTTTCT GTATATCGCT ACTTGAGACA 
801 AAGGGATTTA ACGACTTCAG AATTAAGATG TTTTCACCTC TTAAGTTGTT 
851 TTAAAGGGGA TGTGGTTCAT TGTTTAGCTT CATTTGAAAA CCCTAaIgM 

*5 I 01 ttagcagatt ctgacttttt agaagcttgt aagaacgtgg amgg^gI 

^ GTTTATTTCG gcatgtgaga aggctctttt aaagaatccg caaggaattt 
iooi ccattaagga tctaaaacaa tttttagtga ggtaa c-^ggaattt 

The PSORT algorithm predicts inner membrane (0.6307). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 109A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
109B) and for FACS analysis.

These experiments show that cp6395 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
Example 110 

The following C.pneumoniae protein (PID 4376396) was expressed <SEQ ID 219; cp6396>:

MIEFAFVPHT SVTADRIEDR MACRMNKLST LAITSLCVLI SSVCIMTGTT 
CISGTVGTYA FWGUFSVL ALVACVFFLY FFYFSSeSk CaSf^ 
PIPAWSALR SYEYISQDAI KDVIKDTMQL STiWdpe SfLEFPWN 

kllkdhfdIjK =™ ESSS 






KT.LKnH^nT.TC rZ^7ZZ„ XiJJJWa - Llww; J. KX L P WLK DPNITPDDFW 
KLUCDHEDLK DFKKRIATWI RKAYPEIRLP KKHCTjDK S X Y KGCCKFLLLS 
Jo" =5 S22ES S? LGSE VPMVL ™

The cp6396 nucleotide sequence <SEQ ID 220> is: 



1 

51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 



ATGATCGAGT TTGCTTTTGT TCCTCATACC Trrr^a^r ,™ 

SSSSS? ATGGCCTCTC «*rSK£ SSSS SSSSS 

CAAGTCTTTG TGTATTGATC AGTTCAGTTT GTATTATGAT TGGTa™^ 
!^™ TG GAACGGO^G GACCTATGCA TtSS SSS 
TTCTGTGCTT GCTTTGGTAG CATGTGTTTT CTTTCTTTAT SpSS 
^™° A GGAM ™° TGTGCTTCTT S 
CCTATACCAG CTGTGGTTTC TGCATTGCGT TCCTATGAAT 
fJf^lt rC TAAAAGATAC GATGCAGTTG 

CTTCTCTTTT AGATCCCGAA GCTTTTTTCT TAGAATTTCC T^™*™ 
TCTTTGATAG TGAATCATTC GATGAAGGAA GcScgS 

ATGGTTraaa ttgg^GgL" SSSS 

601 ^Z^i atggttgaaa gatcctaata tcactcctga tgatttctgg 
601 aagctattaa aagaccattt cgatttaaag GACTTTAAGA mmSST 

651 CACTTGGATA CGGAAGGCCT ATCCAGAAAT TAGATTACCG AaS™ 

751 rr™ TAA GT <™™T AAGGGG^T SS ATTACTTTCT 

80^ T?™^ ^^TCA GAGGTTATTA CATAAGGTCT GTTATTTCTC 

85°1 ST GGGAAGTGAA aJSSSS 

qni A***^ CCCTAAC^TT CCCAAGGATC TTACCTGGGA GATGTTTATC: 

9 9 5°i ssss sees* agagaggggc attg= 

The PSORT algorithm predicts inner membrane (0.6095). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 110A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
110B) and for FACS analysis.

These experiments show that cp6396 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone.
Example 111 

The following C.pneumoniae protein (PID 4376408) was expressed <SEQ ID 221; cp6408>:

t™ GSFL ^HLKKTRE SLKEGSISLD QLMQ IEDIAI 
GLSFITDGEF RRATWHYDFM WGFHGVGHHR ATBGVFFDTF 
2?™™ DKISVSHHPF VDHFKFVKAL EDEFTtS ^pS™ 
GGL™x^c f^ F ™ h IE ™GYRK VIRDLYDAGC SH 
™FfIsG mi °° YLL ™ NLVIADRPDD LVvElhS 

301 ^^ISISSES^SSESSSS 

351 NKLTEEEQWA KVALVKEISE EVWK* V«-<**AbCEIG 

The cp6408 nucleotide sequence <SEQ ID 222> is: 

^™^ ACTT CACTAA ^AAG ACCTCTGAAA TCTCATTTTG ATGTTGTCrrr 

sees ssss ss™ is-™ ™= 
=s jssss = sss EsS 

tcaccacaga gctacagaag gagttttct? tSSggS 
r™o ATGA tcgatgatac ctatctgaca gacaagatct SotS™ 
ttacgS T t ™ ttg ? aaaag™ SEES 

»a^»m A0TGC AAAGCAAACT CTTCCTGCAC CGGCACAGTT tttas^J 

tcS^^ ctaataatat agaggtcaca cgtS I^cc^S 
a™ AGCTA ATTGAAGAta ttgttgcagg ttatcgtaaa gtcattcgcg 

GGAGGT^n TGCTGGCTGC CGCTATCTCC AATTAGiSol 
^Sj? ™ G ^ CCTCG ^CTGTTCG TGGTAK3GTA TCGATGAAAA 
A ^n^ ^ GATCTGAT TC AACAATATCT TCTGATTAAT AATCTTG^VV 
TTGCAGATCG TCCCGATGAT CTAGTCGTTA ATTTACATGT 



1 

51 
101 
151 
201 
251 



1 
51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 
651 
701 
751 AACTACCACT CAAAATTCTT TGCTAGTGGT AGTTATGACT TTATTTPAaa

801 GCCCCTATTC GAACAAACAA ATGTAGACGG CTACTATTTA GAGTTTGATC 

851 ATGAGCGTTC TGGAGACTTC TCTCCTCTCA CCTTCATTTC TGGAGAAAaI 

901 ACTGTCTGCT TAGGTCTTGT TACCAGCAAA ACCCCTACAC TTGAAAATAA 

1001 S™° ATTGCTCG ^ TACATCAAGC AGCAGACTAC CTGCCCTTGG 

inV ^^ CTCTC TCTAAG TCCA CAGTGTGGTT TTGCTTCATG TGAAATAGGA 

1051 AATAAATTAA CAGAAGAAGA GCAATGGGCT AAAGTTGCTC TAGTAAAAGA 

1101 AATTTCCGAA GAAGTTTGGA AATAA ^UiAAAAGA 

The PSORT algorithm predicts cytoplasm (0.2171). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 111A) and also as
a his-tagged product. The his-tag protein was used to immunise mice, whose sera were used in a
Western blot (Figure 111B) and for FACS analysis.

These experiments show that cp6408 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 
Example 112

The following C.pneumoniae protein (PID 4376430) was expressed <SEQ ID 223; cp6430>:

1 MKLYSISSDV DTPWIFQLMS KVDSYLFLGG NRIKWSIVM QEPNL I IGKV 

51 ENVRISTIVK ILKILSFLIF PLILIALALH YFLHAKYANH LLVSKILE^I 

20 ,V PQYVPIPGRS GDTASHYKLT TLVPVSQKNX, QAMGSNPLEV EAALRTTKPS 

20 ll\ ^S PAKYRQ IIISSHGI ^ SLDLEQliADD INLDSVSWPT EY^Sfc 

201 SKADKKVIQN VQNLRTGTYI NSVGKRSLIK FMLQHLFIDG ITQENPEttp 

301 ^T G ™ MVKYIYSHF TPQNPTIWPQ VFFRQGPLDE DRgS^E 

301 QLQELGVRFP ICPSQGPDNF NFQGFQGIRI YWEDSYQPMK EV* 

The cp6430 nucleotide sequence <SEQ ID 224> is: 

25 * ATGAAACTTT ATAGCATCTC TTCAGATGTA GATACACCTT GGATATTTCA 

51 GCTTATGTCA AAGGTAGATT CTTATCTTTT CTTAGGCGGG AATAGAATCA 

\£ ™f TATAGT ™ caagaaccta acttaattat ^aIgS 

151 GAAAACGTTC GGATCTCCAC AATAGTGAAA ATATTAAAGA TTTTATCCTT 
30 IV C ^AA TCTT C CCTCTGATTT TAATCGCTTT AGCCCTACAC TA™Sc 

JU 251 ATGCTAAATA TGCTAATCAC TTACTTGTAT CTAAGATTTT AGAAAGAGCT 

301 CCTCAGTATG TGCCTATTCC TGGTCGTTCA GGAGACACGG CGTCTcS 
351 TAAATTAACA ACATTGGTTC CAGTATCCCA AAAAAATCTA CAaScTATGG 

401 GATCAAATCC TCTAGAAGTT GAAGCGGCTC TTCGAACTAC AAAACCCTCT 

35 tV TTTTTCTGTG TACCTGCAAA ATACCGTCAG ATTATAATTT CAAGTCACGG 

35 "I CATTCGCTTT TCTTTAGATC TTGAACAACT TGCTGATGAC AOTaI™ 

551 ATTCGGTTTC CTGGCCTACG GAGTATCTTA ACTCTACTAT GGAM™ 

601 AGCAAGGCAG ATAAACGTGT TATACAGAAT GTACAAAATC TGCGGACAGG 

651 AACTTACATA AATTCTGTAG GAAAGCGTAG CCTTTTAAAA TTCATGotS 

40 IV AGCA CCTATT TATTGATGGG ATCACACAAG AAAACCCTGA AGCCCTTCCT 

ani ^T ACM CTGGAAGAC * GACTCTATTC CCTAGTGTTC GTTATMCtI 

801 TTCTCATTTT AOTCCACAAA ATCCTACAAT ATGGCCGCAA GTCTTTTTCA 

851 GACAAGGTCC TCTAGATGAA GATCGAGGAG GAGGATTTGA GATCTTAGAG 

901 CAATTACAAG AGTTAGGAGT TAGGTTTCCA ATTTGCCCCT CTCaIgGACC 

AK ,1V: AGACAA TCCT AATTTTCAAG GTTTTCAAGG GATTCGTATC TATTGGGAAG 

45 1001 ATTCCTATCA ACCCAATAAG GAGGTTTAA TATTGGGAAG 

The PSORT algorithm predicts inner membrane (0.5140). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 112A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
112B) and for FACS analysis.

These experiments show that cp6430 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.
Example 113



The following C.pneumoniae protein (PID 4376439) was expressed <SEQ ID 225; cp6439>:

1 MSYDTLFKNL EKEDSVHKIC NE I FALVPRL NTIACTEAII KNLPKADIHV 

51 HLPGTITPQL AWILGVKNGF LKWSYNSWTN HRLLSPKNPH KQYSNIFRNF 

101 QDICHEKDPD LSVLQYNILN YDFNSFDRVM ATVQGHRFPP GGIQNEEDLL 

151 LIFNNYLQQC LDDTIVYTEV QQNIRLAHVL YPSLPEKHAR MKFYQ I LYRA 

2 01 SQTFSKHGIT LRFLNCFNKT FAPQ1NTQEP AQEAVQWLQE VDSTFPGLFV 

2 51 GIQSAGSESA PGACPKRLAS GYRNAYDSGF GCEAHAGEGI ETRTIFSSAK 

3 01 VNPEGIiIElT RVTFSSLKRK QPSSLPIRVT CQLG* 

The cp6439 nucleotide sequence <SEQ ID 226> is:

1 ATGTCTTATG ATACGTTATT CAAGAATCTT GAAAAGGAAG ATTCTGTACA 

51 TAAGATATGC AATGAGATCT TTGCATTAGT ACCACGACTC AATACAATCG 

101 CTTGCACCGA AGCTATCATC AAAAACCTCC CCAAAGCAGA TATCCATGTA 

it 151 CACCTTCCTG GGACCATAAC ACCTCAATTA GCTTGGATTT TAGGTGTGAA 

lJ 201 AAATGGGTTC TTAAAATGGT CTTATAATTC TTGGACCAAT CATCGATTAC 

251 TTTCTCCTAA GAATCCTCAT AAACAATACT CCAATATTTT C CGAAACTTT 

3 01 CAAGATATCT GTCACGAAAA GGATCCGGAT TTAAGTGTAT TACAATATAA 

351 TATCTTAAAT TACGATTTTA ATAGCTTTGA TAGAGTGATG GCTACAGTAC 

9n f? 1 AAGGACATCG CTTTCCTCCT GGAGGAATCC AAAATGAAGA AGACCTTCTT 

ZV 451 CTCATTTTCA ATAACTATCT CCAGCAATGT CTGGACGATA CTATCGTGTA 

501 TACTGAAGTA CAACAAAATA TCCGCCTTGC CCATGTTTTG TATCCTTCAT 

551 TACCTGAAAA GCACGCGCGT ATGAAGTTTT ATCAAATCTT GTATCGTGCT 

601 TCGCAAACGT TTTCAAAACA CGGGATTACT TTACGATTTT TAAACTGCTT 
CAATAAAACA TTTGCTCCAC AAATAAACAC ACAAGAACCT GCCCAAGAAG 

^ J 701 CTGTTCAATG GCTCCAAGAG GTTGATTCTA CATTTCCTGG TCTATTTGTA 

751 GGGATACAAT CCGCAGGATC AGAATCTGCG CCCGGAGCCT GTCCTAAGCG 

801 ATTAGCTTCT GGATATAGAA ATGCTTATGA CTCAGGGTTT GGTTGTGAAG 

851 CTCATGCTGG AGAAGGCATA GAGACCCGGA CTATTTTTTC GTCAGCTAAG 

o n 901 GTAAATCCAG AGGGATTGAT CGAGATAACC CGAGTGACTT TCTCGTCTCT 

° U 951 TAAACGAAAA CAGCCATCTA GTTTACCCAT AAGAGTTACT TGCCAGTTAG 

1001 GATAA ^ 

The PSORT algorithm predicts cytoplasm (0.1628). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 113A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
113B) and for FACS analysis.

These experiments show that cp6439 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 114 



The following C.pneumoniae protein (PID 4376440) was expressed <SEQ ID 227; cp6440>:

40 * ^QSARRHLNT IFILDFGSQY TYVLAKQVRK LFVYCEVLPW WISVQCLKER 

51 APLGIIXjSGG PHSVYENKAP HLDPEIYKLG IPILAICYGM QLMARDFGGT 

101 VSPGVGEFGY tpihlypcel fkhivdcesl dteikmshrd hvttipegfn 

Ibl VIASTSQCSI SGIEKTTKQRI* YGLQFHPEVS DSTPTGNKIL ETFVQEICSA 

A c ZZ* P ^ L WNPLYIQ QDLVSKIQDT VIEVFDEVAQ SLDVQWLAQG TIYSDVIESS 

* D ZZ} RSGHASEVTK SHHNVGGLPK NLKLKLVEPL RYLFKDEVRI LGEALGLSSY 

3 01 LLDRHPFPGP GLTIRVIGEI LPEYLAILRR ADLIFIEELR KAKLYDK1SQ 

Ann AFALFLPIKS VSVKGDCRSY GYTIALRAVE STDFMTGRWA YLPCDVLSSC 
SSRIINEIPE VSRWYDISD KP PAT I EWE* 

The cp6440 nucleotide sequence <SEQ ID 228> is: 



1 CA ^GGAGACA TTTGAACACC ATATTTATTC TAGATTTTGG
51 ATCTCAATAT ACTTATGTAT TAGCAAAGCA AGTGCGGAAG TTATTTGTAT
101 ATTGCGAAGT TCTTCCCTGG AATATCTCTG TGCAATGTTT AAAAGAAAGA
151 GCGCCTTTGG GGATCATTCT CTCAGGAGGT CCTCACTCTG TCTATGAAAA
201 CAAGGCTCCA CATTTAGATC CTGAAATCTA TAAACTTGGC ATTCCAATTC
251 TAGCTATTTG CTATGGCATG CAGCTTATGG CTAGAGATTT TGGAGGGACT
301 GTAAGCCCTG GTGTAGGAGA ATTTGGATAT ACGCCCATCC ATCTGTATCC
351 ^^ AGCTC TTCAAACACA TCGTCGACTG CGAATCTCTA GACACAGAGA
401 TTCGGATGAG CCATCGGGAT CATGTTACGA CAATTCCTGA AGGATTTAAT
451 GTAATCGCAT CCACCTCACA ATGCTCGATC TCAGGAATAG AAAATACCAA
501 ACAACGGTTG TACGGGCTGC AATTTCATCC CGAGGTTTCT GACTCCACTC
551 CAACGGGAAA TAAGATTCTA GAAACTTTTG TTCAAGAGAT CTGTTCTGCT
601 CCCACACTAT GGAATCCCTT GTATATTCAG CAAGACCTTG TAAGTAAAAT
651 TCAAGATACC GTTATTGAAG TATTTGATGA AGTCGCTCAG TCATTAGACG
701 TACAATGGTT AGCTCAAGGA ACCATCTACT CAGATGTTAT TGAGTCCTCA
751 CGCTCTGGAC ATGCCTCCGA AGTAATAAAA TCACATCATA ATGTAGGGGG
801 GCTTCCAAAA AATCTTAAGC TGAAGTTAGT CGAGCCCTTA CGTTATTTAT
851 TTAAAGATGA AGTTCGAATT TTAGGAGAAG CCCTAGGACT TTCTAGCTAT
901 CTCTTGGACA GGCATCCTTT TCCTGGACCT GGCTTGACAA TTCGTGTGAT
951 TGGAGAGATC CTTCCTGAAT ATCTAGCCAT TTTACGACGG GCGGACCTCA
1001 TCTTTATAGA AGAGCTTAGG AAAGCAAAAC TCTACGATAA AATAAGCCAA
1051 GCCTTTGCTC TATTTCTTCC TATAAAATCA GTATCTGTAA AAGGAGATTG
1101 TAGAAGCTAT GGTTATACCA TAGCATTACG TGCTGTAGAA TCTACAGATT
1151 TCATGACAGG ACGATGGGCC TACCTTCCAT GCGATGTTCT CAGTTCTTGr

1251 TATTTCTGAC AAGCCACCAG CAACTATAGA ATGGGAATAG

The PSORT algorithm predicts cytoplasm (0.0481). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 114A) and also as
a his-tagged product. The recombinant proteins were used to immunise mice, whose sera were used
in a Western blot (Figure 114B) and for FACS analysis.

These experiments show that cp6440 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone.



Example 115 



The following C.pneumoniae protein (PID 4376475) was expressed <SEQ ID 229; cp6475>: 



1 MNTYTFSPTL QKSFSLFLLE KLDSYFFPGG TRTQILVITP TNIRLAAKKR
51 GCKVSTIEKI IKILSFILLP LVIIAFILRY FLHKKFDKQF LCIPKVISNE
101 DEALLGSRPQ AVEKAVREIS PAFFSIPRKY QLIRIDTPKD DAPSILFPIG
151 IEIILKDLCI DTLKQSNLFL KREMDFLGHP EEKALFDSIC SIEKDQEWMS
201 LESKKLLITH FLKYLFVSGI EQLNPGFNPE NGRGYFSEIS TAKIHFHQHG
251 RYGPIRSSGP IMKEI*

The cp6475 nucleotide sequence <SEQ ID 230> is: 

1 ATGAATACCT ATACCTTCTC TCCTACACTT CAGAAAAGCT TCAGCCTATT
51 TCTTTTAGAA AAATTAGACT CTTACTTTTT CTTTGGAGGG ACTCGTACAC
101 AAATCTTAGT CATCACACCA ACCAATATTA GATTAGCAGC TAAAAAAAGA
151 GGGTGTAAGG TTTCTACTAT AGAAAAGATA ATCAAGATCC TCTCTTTTAT
201 CCTGCTGCCC CTAGTTATCA TTGCCTTTAT ACTTCGCTAT TTCTTACATA
251 AGAAATTCGA TAAACAGTTC TTGTGTATCC CAAAAGTCAT TTCTAACGAA
301 GACGAAGCTC TTCTTGGATC TAGACCACAA GCAGTTGAAA AAGCAGTTCG
351 AGAAATATCT CCAGCCTTCT TCTCTATACC AAGAAAATAC CAACTTATTA
401 GAATCGACAC TCCTAAAGAT GACGCTCCCT CAATCCTTTT CCCTATAGGC
451 ATAGAGATCA TTCTCAAAGA TTTATGTATT GATACACTCA AGCAATCTAA
501 TCTTTTCCTT AAAAGAGAAA TGGATTTCTT AGGTCATCCA GAAGAAAAAG
551 CATTATTCGA CTCGATATGT TCTATAGAAA AAGATCAAGA ATGGATGAGC
601 TTGGAAAGTA AAAAACTTTT AATCACGCAC TTCCTAAAGT ATCTCTTTGT
651 CTCTGGAATC GAACAACTAA ATCCAGGCTT TAACCCAGAG AATGGGCGTG
701 GGTATTTTTC AGAAATAAGT ACAGCAAAGA TCCATTTTCA TCAGCACGGT
751 CGATATGGGC CAATCCGTTC TTCGGGACCC ATCATGAAGG AAATATAA

The PSORT algorithm predicts inner membrane (0.5373). 
The protein was expressed in E.coli and purified as a GST-fusion product (Figure 115A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
115B) and for FACS analysis.

These experiments show that cp6475 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone.
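
The sequence listings in these Examples are printed in rows of five 10-residue groups, each row preceded by the position of its first residue. As an illustrative sketch only (it is not part of the disclosed methods), such a listing can be joined into one contiguous string as shown below. The function name join_sequence_blocks is hypothetical, and the two rows used are the final rows of the cp6475 nucleotide listing <SEQ ID 230>.

```python
def join_sequence_blocks(listing: str) -> str:
    """Join a numbered, grouped sequence listing into one contiguous string.

    Purely numeric tokens are treated as row position indexes (1, 51, 101, ...)
    and are skipped; every other token is taken as sequence text.
    """
    parts = []
    for line in listing.splitlines():
        for token in line.split():
            if token.isdigit():
                continue  # position index, not sequence
            parts.append(token)
    return "".join(parts).upper()


# Final two rows of the cp6475 nucleotide listing, as printed.
example = """
701 GGTATTTTTC AGAAATAAGT ACAGCAAAGA TCCATTTTCA TCAGCACGGT
751 CGATATGGGC CAATCCGTTC TTCGGGACCC ATCATGAAGG AAATATAA
"""

seq = join_sequence_blocks(example)
print(len(seq), seq[-3:])
```

Skipping every purely numeric token is a deliberate simplification: it also discards stray margin numbers, but it cannot repair characters that were misread within a group.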

Example 116 

The following C.pneumoniae protein (PID 4376482) was expressed <SEQ ID 231; cp6482>:

1 MLVELEALKR EFAHLKDQKP TSDQEITSLY QCLDHLEFVL LGLGQDKFLK 

in 51 ATEDEDVLFE SQKAIDAWNA LLiTKARDVLG LGDIGAIYQT IEFLGAYLSK 

1U 101 VNRRAFCIAS EIHFLKTAIR DXjNAYYLIjDF RWPLCKIEEF VDWGNDCVEI 

151 AKRKLCTFEK ETKELNESLL REEHAMEKCS IQDLQRKLSD IIIELHDVSL 

2 01 FCFSKTPSQE EYQKDCLYQS RLRYLLLLYE YTLLCKTSTD FQEQARAKEE 
251 FIREKFSLIiE LEKGIXQTKE LEFAIAKSKL ERGCLVMRKY EAAAKHSLDS 

3 01 MFEEETVKSP RKDTE* 

The cp6482 nucleotide sequence <SEQ ID 232> is:

1 ATGCTAGTAG AGTTAGAGGC TCTTAAAAGA GAGTTTGOGC ATTTAAAAGA 

51 CCAGAAGCCG AC AAGTGAC C AAGAGATCAC TTCACTTTAT CAATGTTTGG 

101 ATCATCTTGA ATTCGTTTTA CTCGGGCTGG GCCAGGACAA ATTTTTAAAG 

oa 151 GCTACGGAAG ATGAAGATGT GCTTTTTGAG TCTCAAAAAG CAATCGATGC 

ZU 201 GTGGAATGCT TTATTGACAA AAGCCAGAGA TGTTTTAGGT CTTGGGGACA 

251 TAGGTGCTAT CTATCAGACT ATAGAATTCT TGGGTGCCTA TTTATCAAAA 

301 GTGAATCGGA GGGCTTTTTG TATTGCTTCG GAGATACATT TTCTAAAAAC 

351 AGCAATCCGA GATTTGAATG CATATTACCT GTTAGATTTT AGATGGCCTC 

0 e 401 TTTGCAAGAT AGAAGAGTTT GTGGATTGGG GGAATGATTG TGTTGAAATA 

Z2> 451 GCAAAGAGGA AGCTATGCAC TTTTGAAAAA GAAACCAAGG AGCTCAATGA 

501 GAGCCTTCTT AGAGAGGAGC ATGCGATGGA GAAATGCTCG ATTCAAGATC 

551 TGCAAAGGAA ACTTAGCGAC ATTATTATTG AATTGCATGA TGTTTCTCTT 

601 TTTTGTTTTT CTAAGACTCC CAGTCAAGAG GAGTATCAAA AGGATTGTTT 

« n 651 GTATCAATCA CGATTGAGGT ACTTATTGTT GCTGTATGAG TATACATOGT 

JU 701 TATGTAAGAC ATCCACAGAT TTTCAAGAGC AGGCTAGGGC TAAAGAGGAG 

751 TTCATTAGGG AGAAATTCAG CCTTC TAGAG CTCGAAAAGG GAATAAAACA 

801 AACTAAAGAG CTTGAGTTTG CAATTGCTAA AAGTAAGTTA GAACGGGGCT 

851 GTTTAGTTAT GAGGAAGTAT GAAGC TGCCG CTAAACATAG TTTAGATTCT 

901 ATGTTCGAAG AAGAAACTGT GAAGTCGCCG CGGAAAGACA CAGAATAA 

The PSORT algorithm predicts cytoplasm (0.4607).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 116A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
116B) and for FACS analysis.

These experiments show that cp6482 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone.
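
Each Example pairs a protein sequence with the nucleotide sequence that encodes it, so one listing can be checked against the other by conceptual translation. The sketch below assumes the standard genetic code; the names CODON_TABLE and translate are hypothetical. It translates the final row of the cp6482 nucleotide listing <SEQ ID 232>, which begins at position 901 and is therefore in frame, for comparison with the final row of the protein listing <SEQ ID 231> (MFEEETVKSP RKDTE*).

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str) -> str:
    """Translate an in-frame nucleotide string, stopping at the first stop codon.

    Whitespace is ignored; codons containing unrecognised characters
    (for example misread letters) are reported as 'X'.
    """
    seq = "".join(nucleotides.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


# Final row (position 901 onwards) of the cp6482 nucleotide listing.
last_row = "ATGTTCGAAG AAGAAACTGT GAAGTCGCCG CGGAAAGACA CAGAATAA"
print(translate(last_row))
```

A mismatch between the translation and the printed protein row usually points to a misread character in one of the two listings rather than to a real difference between the sequences.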



Example 117 



The following C.pneumoniae protein (PID 4376486) was expressed <SEQ ID 233; cp6486>:

1 WWALFILG IFFLSGSLAF LVHTS CGVLL GAALPILCIG LVLLAVALIV 

51 FLCHRHKTRQ DLDYYDQDLD SXA7IHKKEIP HDISELRVTF EKLQEILFQFH 

101 TKDFSDLSQE LQGKFlNCME KWLTLEDEVT KFLXVRDRFIj ETRRNFTTFG 

151 EQVKGIQSNX FDLHEEKSSL YIjELYRLRKD LQVLLNFFLL PFGILKVDYD 

201 eieaikglfx rltsrldkld vkaqerkkfi nemsrefkev ekafdivdra 

251 TKKLMDRAKK ESPARLFJKGR TESLLEMKKN EEALKNQGLD PENLSHPELF 

301 SPYQQLLILW YLNSEIVLHH YEFLISGTVT SGLTLEECEN RMRAASTGLH 
351 ALLVRKLQFR GAIKSAYFEK LTEIEKELRS LQDVIKSLEL ELIHKIKDIV
401 TEET* 

The cp6486 nucleotide sequence <SEQ ID 234> is: 

1 GTGGTGGTTG TCGCTTTATT TATCCTTGGG ATTTTCTTTT TATCTGGTTC 

51 TCTTGCATTC CTTGTTCATA CGTCTTGCGG AGTTCTTTTA GGAGCGGCGC 

101 TTCCCATACT TTGCATAGGT CTTGTTTTAT TGGCTGTAGC TCTTATTGTT 

151 TTCTTATGTC ACAAACACAA GACTCGTCAA GATTTAGATT ATTATGATCA 

201 AGATTTAGAT TCTTTGGTGA TTCATAAGAA AGAGATCCCC AATGACATCT 

251 CTGAGTTGCG GGTAACATTT GAAAAGTTGC AAAATCTGTT TC AGTTC CAT 

3 01 ACGAAAGATT TCTCTGATCT AAGCCAAGAG CTTCAGGGTA AATTTATCAA 

351 TTGCATGGAG AAATGGCTAA CTTTAGAAGA CGAAGTGACT AAATTTCTTA 

401 TTGTTCGAGA TAGATTTTTA GAAACCAGAA GAAATTTTAC CACTTTTGGA 

451 GAACAGGTTA AAGGGATCCA AAGCAATATT TTTGATTTGC ATGAGGAAAA 

501 GTCTTCATTA TATTTAGAAT TGTATAGGCT TAGGAAAGAC CTCCAAGTTC 

551 TATTAAATTT TTTTCTGCTC CCCCCAGGTA TACTCAAGGT AGATTATGAT 

601 GAAATTGAGG CTATCAAAGG TCTGTTTATA AGATTAACCT CTAGATTAGA 

651 TAAGCTTGAT GTGAAAGCTC AGGAACGTAA GAAGTTCATT AATGAAATGA 

701 GTAGGGAATT TAAAGAAGTA GAGAAAGCTT TTGATATTGT CGATAGGGCA 

751 ACAAAAAAGC TTATGGATAG AGCCAAGAAA GAAAGTCCGG CACGTCTTTT 

801 CATGGGTAGA ACTGAGTCTC TCTTAGAAAT GAAAAAAAAT GAAGAAGCCC 

851 TTAAAAATCA GGG- iCTAGAT CCTGAAAATC TTTCCCATCC TGAACTTTTT 

901 AGTC CGTATC AACAGCTTTT AATTTTGAAT TATTTAAATA GCGAAATAGT 

951 TCTGCATCAT TATGAGTTCC TTATTTCTGG AACAGTAACT TCTGGCCTAA 

1001 CTCTTGAAGA ATGTGAAAAT CGAATGAGGG CGGCTTCTAC TGGGTTGAAC 

1051 GCCCTTCTGG TGCGTAAGCT C C AGTTC AGA GGTGCTATAA AATCTGCGTA 

1101 TTTTGAAAAA CTCACAGAGA TTGAAAAAGA GTTACGATCA CTTCAAGACG 

1151 TAATAAAGTC ATTGGAACTA GAACTGATCC ATAAGATAAA AGATATAGTG 

1201 ACAGAAGAAA CTTAG 

The PSORT algorithm predicts inner membrane (0.7474). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 117A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
117B) and for FACS analysis.

These experiments show that cp6486 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 118 

The following C.pneumoniae protein (PID 4376526) was expressed <SEQ ID 235; cp6526>:

1 MSPFKKIVNR LLiCYISFQKE SRTkPIIIRE PRMTTKSLGS FNSVISKNKI 

51 HFISLGCSRN LVDSEVMLGI LLKAGYESTN EIEDADYLIL NTCAFLKSAR 

101 DEAKDYLDHL IDVKKENAKI IVTGCMTSNH KDELKPWMSH IHYLLGSGDV 

151 ENILSAIESR ESGEKISAKS YIEMGEVPRQ LSTPKHYAYk KVAEGCRKRC 

201 AFCIIPSIKG KLRSKPLDQI LKEFRILVNK SVKEIILIAQ DLGDYGKDLS 

251 TDRSSQZ.ESD LHELLKEPGD YWLRMLYLYP DEVSDGIIDL MQSNPKLIiPY 

301 VDIPIjQHIND RILKQMRRTT SREQILGFLE KLRAKVPQVY IRSSVIVGFP 

351 GETQEEFQEL ADFIGEGWID NLGIFLYSQE ANTPAAELPD Q I PEKVKE SR 

401 LKILSQIQKR NVDKHNQKLI GEKIEAVIDN YHPETNLLLT ARFYGQAPEV 

451 DPC1IVNEAK LVSHFGERCF IEITGTAGYD LVGRWKKSQ NQALLKTSKA 

501 - 



5V1 

The cp6526 nucleotide sequence <SEQ ID 236> is: 

1 ATGAGTCCTT TTAAGAAAAT AGTAAATCGC TTACTATGCT ATATTTCTTT

51 TCAAAAAGAA TCAAGAACTC TCCCAATCAT TATTAGAGAA C CTAGGATGA 

101 CAACAAAAAG TTTAGGATCT TTCAATTCAG TTATTTCCAA AAATAAAATT 

151 CATTTTATTA GTTTGGGATG CTCTCGGAAC CTTGTAGATA GCGAAGTCAT 

201 GCTAGGCATT CTTCTTAAGG CAGGTTACGA GTCTACTAAT GAAATTGAAG 

251 ATGCTGACTA TTTAAT TTTA AATACCTGTG CGTTTTTAAA AAGTGCTAGA 

301 GATGAAGCTA AAGATTATCT AGACCATCTA ATTGATGTAA AAAAAGAGAA 

351 CGCTAAAATT ATTGTAACTG GATGCATGAC TTCCAACCAC AAAGATGAGC
401 TTAAACCCTC GATGTCACAC ATCCATTACC TACTAGGTTC TGGGGATGTT
451 GAGAATATTC TTTCTGCTAT TGAGTCTCGT GAATCTGGAG AAAAAATCTC
501 TGCAAAGAGT TACATTGAGA TGGGAGAAGT TCCAAGACAG CTTTCCACAC
551 CAAAACACTA TGCCTATTTA AAAGTTGCTG AGGGCTGTAG AAAACGTTGT
601 GCTTTTTGTA TTATTCCTTC CATTAAAGGA AAGCTCCGCA GCAAACCTCT
651 GGATCAAATT CTTAAAGAAT TCCGCATCCT TGTAAACAAG AGTGTGAAAG
701 AGATTATATT GATAGCTCAA GACCTAGGAG ATTATGGAAA GGATCTCTCT
751 ACAGACCGCA GTTCGCAGCT AGAATCACTA TTACATGAGT TACTGAAAGA
801 GCCTGGTGAT TATTGGCTGC GGATGTTGTA TTTATATCCT GATGAAGTGA
851 GTGATGGCAT TATAGATCTT ATGCAATCTA ATCCCAAACT TCTTCCCTAT
901 GTAGATATTC CCTTACAGCA CATTAACGAC CGTATTTTAA AGCAAATGCG
951 AAGAACGACT TCTAGGGAGC AAATCCTAGG ATTCCTAGAA AAATTACGTG
1001 CCAAGGTTCC TCAGGTCTAT ATCCGTTCTT CTGTTATTGT GGGTTTCCCC
1051 GGTGAAACTC AGGAAGAATT CCAGGAGTTA GCTGATTTTA TTGGTGAGGG
1101 TTGGATTGAT AATCTCGGAA TTTTCTTGTA CTCTCAAGAA GCGAATACCC
1151 CGGCAGCAGA ACTCCCTGAC CAGATACCAG AAAAAGTTAA AGAATCGAGG
1201 TTGAAAATTC TATCTCAAAT TCAGAAACGC AATGTGGATA AACATAATCA
1251 GAAGCTCATT GGGGAAAAAA TAGAAGCAGT TATTGATAAC TATCATCCTG
1301 AAACGAATCT TTTACTCACT GCAAGGTTCT ATGGACAAGC TCCTGAAGTG
1351 GACCCTTGTA TTATTGTAAA TGAGGCGAAG CTTGTTTCTC ATTTTGGAGA
1401 AAGATGCTTT ATAGAAATCA CAGGGACTGC TGGTTACGAC CTTGTAGGGC
1451 GTGTTGTAAA AAAATCTCAG AACCAAGCTT TGCTAAAAAC TAGCAAAGCT
1501 TAG

The PSORT algorithm predicts cytoplasm (0.1296). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 118A) and also as
a his-tagged product. The recombinant proteins were used to immunise mice, whose sera were used
in a Western blot (Figure 118B) and for FACS analysis.

These experiments show that cp6526 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 119 

The following C.pneumoniae protein (PID 4376528) was expressed <SEQ ID 237; cp6528>:



1 MKNNINNNEC YFKLDSTVDG DLLAANLKTF DTQAQGISST ETFSVQGNAT
51 FKDQVSATGL TSGTTYNLNA QNFTSSQISI DFKNNRLSNC ALPKEDCDPV
101 PANYVRSPEY FFCSKPLIGD FDFNSGESYL PLTGSEYTLY QSRWVNSIFR
151 FIGWKQSTRE LTVGGNTAIQ FLAAGTYIVS FTVGKRWGWN NGWGGAIYIN
201 WGLGQVQCES TIYSGGGYAT IGTLGTSIYR ASVDVAPNPN DPKASDRYRA
251 GIFYLSMGGS SAGIGNYSFS LLYYPDDRG*

The cp6528 nucleotide sequence <SEQ ID 238> is:

1 ATGAAAAACA ATATTAATAA TAATGAGTGC TATTTTAAAT TAGACTCAAC
51 TGTAGATGGT GATTTGTTAG CAGCCAATCT CAAGACCTTT GATACACAGG
101 CCCAAGGAAT CTCATCGACT GAAACATTTT CTGTTCAGGG GAATGCAACA
151 TTTAAAGATC AAGTTTCAGC AACTGGATTA ACTTCAGGAA CTACTTATAA
201 TTTAAATGCA CAAAACTTTA CTTCCTCCCA AATCTCTATA GATTTTAAAA
251 ATAATCGTCT GAGTAATTGT GCATTGCCAA AAGAAGACTG CGATCCGGTG
301 CCAGCGAATT ATGTTCGTTC TCCCGAATAT TTTTTCTGTT CCAAGCCTCT
351 GATCGGAGAT TTTGATTTTA ACTCAGGGGA ATCTTATTTG CCTCTGACTG
401 GTTCGGAATA TACTCTATAT CAGTCACGTA ATGTAAATAG TATATTTCGT
451 TTTATAGGAT GGAAGCAAAG TACACGAGAA TTAACTGTAG GGGGAAATAC
501 TGCGATACAA TTTCTTGCAG CAGGAACCTA TATCGTTTCA TTTACTGTTG
551 GTAAACGGTG GGGATGGAAT AATGGTTGGG GAGGAGCCAT TTATATCAAT
601 AATGGTTTAG GACAAGTCCA ATGTGAAAGC ACGATTTATA GTGGTGGAGG
651 GTATGCAACA ATAGGTACAC TGGGGACCTC AATATATAGA GCCTCTGTAG
701 ATGTAGCTCC TAATCCTAAT GATCCGAATG CTTCGGATCG CTATAGAGCG
751 GGTATTTTCT ATCTCAGTAA CGGTGGTTCT AGTGCAGGTA TAGGGAATTA
801 CTCCTTTTCT CTTCTCTATT ATCCGGACGA TAGAGGGTAG

The PSORT algorithm predicts cytoplasm (0 J 668).

The protein was expressed in Kcoli and purified as a GST-fusion product (Figure 119A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
119B) and for FACS analysis.

These experiments show that cp6528 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 120 

The following C.pneumoniae protein (PID 4376627) was expressed <SEQ ID 239; cp6627>:

1 MKCSPLTLVP HIFLKNDCEC HRSCSLKIRT I ARli I LGLVXi ALVSALSFVF 

51 LAAPISYAIG GTLALAAIVI LIITLVVALL AKSKVLPIPN ELQKI I YNRY 

101 PKEVFYFVKT HSL2VNELKI FINCWKSGTD LPPNLHKKAE AFGIDILKSI 

151 DLTLFPEFEE ILLQNCPLYW LSHFIDKTES VAGEIGLNKT QKVYGLLGPL 

201 AFHKGYTTIF HSYTRPLI/TL ISESQYKFLY SKASKNQTOS PSVKKTCEEI 

1S 251 FKELPHNMIF RKDVQGISQF LFLFFSHGIT WEQAQMIQLI NPDNWKMLCQ 

iD 301 FDKAGGHC SM ATFGGFLNTE TNMFDPVSSN YEPTVHFMTW KELKVLLEKV 

351 KE S PMHPASA LVQKICVNTT HHQNLLKRWQ FVRNTSSQWT SSLPQYAFHA 

401 QTYKLEKKIE SSL.PIRSSL* 

The cp6627 nucleotide sequence <SEQ ID 240> is: 

1 ATGAAGTGTA GTCCTTTAAC ACTAGTTCCC CATATATTTT TAAAAAATGA 

ZU 51 CTGCGAATGT CATAGATCTT GTTCTTTAAA AATTAGGACA ATTGCCCGAC 

101 TCATTCTTGG GCTTGTTCTA GCTCTTGTTA GCGCACTTTC TTTTGTTTTC 

151 CTTGCTGCGC CGATTAGCTA TGCTAOTGGA GGAACTTTAG CTTTAGCCGC 

201 TATCGTAATC TTGATTATAA CGCTAGTCGT AGCACTGCTA GCTAAATCAA 

251 AGGTTCTGCC CATCCCCAAC GAACTTCAGA AGATTATTTA CAATCGCTAT 

ZD 3 01 CCTAAAGAAG TCTTOTATTT CGTGAAAACA CACTCCCTGA CTCTTAACGA 

3 51 ATTAAAAATA TTTATTAATT GCTGGAAAAG CGGTACAGAC CTGCCTCCGA 

401 ATTTACATAA AAAAGCAGAG GCTTTCGGGA TCGATATTCT AAAATCTATA 

451 GATTOAACCC TGTTTCCAGA GTT CGAAGAG ATTCTTCTTC AAAACTGCCC 

501 GTTATACTGG CTCTCCCATT TTATAGACAA AACTGAATCT GTTGCTGGGG 

JU 551 AAATCGGATT AAATAAAACA CAAAAAGTTT ATGGTTTACT TGGGCCCTTA 

601 GCGTTTCATA AAGGATATAC AACTATTTTC CACTCTTATA CACGCCCTCT 

651 ACTAACATTA ATCTCAGAAT CACAGTATAA GTTCCTATAT AGTAAAGCGT 

701 CTAAGAATCA ATGGGATTCT CCTTCTGTGA AAAAAACCTG CGAAGAAATA 

^ 751 TTCAAGGAAC TCCCCCACAA TATGATTTTC CGGAAGGATG TTCAAGGAAT 

^ 801 CTCACAATTC TTATTTCTTT TCTTTTCTCA TGGTATCACT TGGGAACAGG 

851 CTCAGATGAT TCAACTTATA AATCCTGATA ATTGGAAAAT GTTGTGTCAG 

901 TTTGATAAAG CAGGAGGC C A CTGTTCCATG GCAACATTTG GAGGCTTTTT 

951 GAATACTGAA ACAAATATGT TCGATCCAGT ATCCTCTAAC TATGAACCTA 

A ~ 1001 CAGTGAACTT CATGACGTGG AAAGAATTGA AGGTTTTACT AGAGAAAGTA 

40 1051 AAAGAAAGTC CTATGCACCC AGCGAGTGCT CTTGTTCAGA AGATATGCGT 

1101 AAATACAACG CACCATCAAA ATCTGTTAAA ACGATGGCAA TTTGTTCGTA 

1151 ATACGAGTTC ACAATGGACA TCAAGCTTAC CTCAGTATGC TTTCCACGCC 

1201 CAAACCTACA AACTAGAGAA AAAAATAGAA AGCAGTCTCC CTATACGATC 

1251 TTCCCTATAA 

The PSORT algorithm predicts inner membrane (0.7198).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 120A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
120B) and for FACS analysis. 

These experiments show that cp6627 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone.
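
Because every nucleotide listing here ends in a stop codon, the two listings of an Example should satisfy: nucleotide length = 3 x (number of residues + 1). The following sketch, with hypothetical function names, applies that check to cp6627 using only the last printed row of each listing. The protein row is entered as SSLPIRSSL, on the assumption that the stray full stop in the printed text is a print artefact.

```python
def listed_length(last_row_start: int, last_row: str) -> int:
    """Total sequence length implied by the last row of a numbered listing.

    last_row_start is the printed position of the row's first residue;
    a trailing '*' (stop marker in protein listings) is not counted.
    """
    residues = "".join(last_row.split()).rstrip("*")
    return last_row_start - 1 + len(residues)


def lengths_consistent(protein_length: int, nucleotide_length: int) -> bool:
    """True if the coding sequence is exactly the protein plus one stop codon."""
    return nucleotide_length == 3 * (protein_length + 1)


# Last rows of the cp6627 protein and nucleotide listings.
protein_len = listed_length(401, "QTYKLEKKIE SSLPIRSSL*")
nucleotide_len = listed_length(1251, "TTCCCTATAA")

print(protein_len, nucleotide_len, lengths_consistent(protein_len, nucleotide_len))
```

The check only confirms that the overall lengths agree; it does not detect substitutions within a row.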
Example 121

The following C.pneumoniae protein (PID 4376629) was expressed <SEQ ID 241; cp6629>: 

1 MSNITSPVIQ NNRSCNYYFE LKNSTTIHIV ISAILLCGAL IAFLCVAAPV 

51 SYILSGALLG LGLLIALIGV XLGIKKITPM ISSKEQVFPQ ELVNRIRAHY 

101 PKFVSDFVSE AKPNLKDLIS FIDLLNQLHS EVGSSTNYW SEELQQKIDT 

151 FEGIAFXKNE VRTASLKRLE SAASSRPLFP SIiPKILQKVF PFFWLGEFIS 

201 AGSKWELHR VKKIGGSKBE DLSDYIKPEM LPTYWLIPLD FRPTNSSILN 

251 LHTLVLARVX) TRDVFQHLKY AALNGEWNL.W HSDLNTMKQQ LFAKYHAAYQ 

301 SYKHLSQPSL QEDEFYNLLL CIFKHRYSWK QMSLIKTVPA DLWENLCCLT 

351 IjDHTGRPQDM EFASLIGTLY TQGLIHKESE AFLSSLTLLS LDQFKTXRRQ 

401 STNIAMFLEN LATHNSTFRS LPPITVHPLK RSVFSQPEED ESSLLIG* 

The cp6629 nucleotide sequence <SEQ ID 242> is: 



1 ATGAGTAATA TAACCTCGCC AGTTATTCAA AATAATCGCT CTTGTAATTA 
1S 51 TTATTTTGAA TTAAAGAATT CAACCACTAT TCATATTGTT ATCAGTGCCA 

LD 101 TCTTACTCTG CGGAGCTTTG ATAGCTTTCT TGTGTGTAGC AGCTCCTGTT 

151 TCCTATATTC TAAGTGGCGC ATTGTTAGGA TTAGGATTAT TAATAGCCTT 

201 GATTGGTGTG ATTTTAGGAA TAAAAAAAAT CACGCCTATG ATTTCATCAA 

251 AAGAACAAGT ATTCCCCCAA GAACTCGTAA ATAGAATCAG GGCGCACTAT 

9n 301 CCTAAATTTG TCTCTGATTT TGTTTCAGAA GCTAAACCAA ATCTTAAAGA 

ZU 351 TCTCATAAGT TTTATTGATC TTC TAAATC A ATTGCACTCT GAAGTTGGAT 

401 CATCTACAAA TTACAACGTA TCTGAAGAAC TACAACAGAA AATAGATACG 

451 TTCGAGGGTA TCGCACGCTT AAAAAATGAA GTCCGTACTG CTTCTCTTAA 

501 AAGACTTGAA AGCGCTGCTT CTTCCCGTCC CCTCTTCCCC TCTTTACCAA 

0 c 551 AAATCTTACA AAAGGTATTT CCATTTTTCT GGTTAGGAGA GTTTATTTCT 

ZD 601 GCAGGCAGCA AGGTTGTAGA GCTCCATCGA GTTAAGAAAA TTGGAGGCAG 

651 CCTCGAAGAA GACCTTAGTG ATTATATAAA ACCAGAGATG CTTCCTACCT 

701 ATTGGTTGAT TCCTTTAGAT TTTAGACCAA CAAATTCCTC TATTCTAAAT 

751 CTACACACAT TAGTTTTAGC TAGAGTCTTA ACTCGTGATG TTTTTCAACA 

^ 801 TCTTAAGTAT GCAGCATTAA ATGGCGAGTG GAACCTGAAT CATAGTGATC 

^ U 851 TAAATACTAT GAAACAGCAG CTCTTTGCTA AATATCATGC GGCGTATCAA 

901 TCCTATAAAC ATCTATCTCA ACCCTCTCTT CAAGAGGATG AATTCTATAA 

951 CCTGCTCTTG TGTATTTTTA AGCATAGGTA CTCGTGGAAG CAGATGTCCT 

1001 TAATAAAAAC AGTCCCGGCT GATTTATGGG AAAACCTCTG TTGCTTGACT 

~- 1051 TTAGAC C ATA CAGGACGACC CCAAGACATG GAATTTGCCT CTCTAATTGG 

1101 TACTC TCTAC ACACAAGGCC TAATTCATAA AGAAAGCGAA GCATT TCTTT 

1151 CTTCATTGAC ACTCCTTAGT TTAGATCAGT TTAAAACGAT CCGTCGTCAG 

1201 TCAACCAATA TAGCGATGTT C CTTGAGAAT TTAGCAACTC ATAATTCCAC 

1251 CTTTAGAAGC TTACCACCTA TAACAGTCCA TC CACTC AAG AGAAGCGTCT 

13 01 TCTCCCAACC TGAAGAAGAC GAGTCCTCCC TGCTGATAGG TTAG 

The PSORT algorithm predicts inner membrane (0.5776).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 121A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
121B) and for FACS analysis. 



These experiments show that cp6629 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 122 



The following C.pneumoniae protein (PID 4376732) was expressed <SEQ ID 243; cp6732>:

1 MEMMSPFQQP EQCHFDVVGS FLRPESLTRA RSDFEEGRIV YEQMRVVEDA
51 AIRNLIKKQT EAGLIFFTDG EFRRYSWDFD FMWGFHGVDR RRDSNDPEIG
101 VYLKDKSSVS KHPFIEHFEF VKTFEKGNAK AKQTIPSPSQ FFHEMIFAPN
151 LKNTRKFYPT NQELIDDIVF YYRQVIQDLY AAGCRNLQLD DCAWCRLLDI
201 RAPSWYGVDS HDRLQEILEQ FLWIHNLVMK DRPEDLFVSL HVCRGDYQAE
251 FFSRRAYDSI EEPLFAKTDV DSYHYYWALD DKYSGGAEPL AYVSGEKHVC
301 LGLISSNHSC IEDRDAVVSR IYEAASYIPL ERLSLSPQCG FASCEGDHRM
351 TEEEQWKKIA FVKEIAKEIW G*

The cp6732 nucleotide sequence <SEQ ID 244> is: 

1 ATGGAAATGA TGAGCCCATT CCAACAACCT GAGCAATGTC ATTTTGATGT 

51 TGTGGGAAGT TTCTTACGTC CTGAAAGTCT TACACGAGCA CGCTCTGATT 

101 TTGAAGAAGG AAGAATTGTC TATGAGCAGA TGCGAGTTGT CGAAGATGCT 

151 GCTATTCGTA ATCTCATAAA AAAGCAAACA GAAGCAGGTC TTATCTTTTT 

201 TACTGATGGG GAATTC CGTA GGTATAGTTG GGATTTCGAC TTTATGTGGG 

251 GATTCCATGG CGTGGATCGT CGCAGGGACT CTAATGACCC TGAAATTGGA 

301 GTGTATCTTA AAGATAAAAT CTCCGTATCA AAACATCCGT TTATAGAACA 

351 TTTCGAGTTT GTCAAAACTT TTGAGAAGGG AAATGCAAAA GCAAAACAAA 

401 CGATTCCTTC TCCATCACAA TTTTTCCATG AGATGATTTT TGCTCCTAAT 

451 CTGAAAAATA CTCGGAAGTT TTATCCTACG AATCAAGAGC TAATTGATGA 

501 TATTGTCTTT TATTATCGCC AAGTCATCCA AGATCTTTAT GCTGCAGGTT 

551 GTCGTAATTT GCAGTTGGAC GATTGTGCTT GGTGTCGCCT CTTGGATATA 

601 CGAGCGCCTT CTTGGTATGG TGTTGATTCT CATGACAGGT TGCAGGAAAT 

651 TTTAGAACAG TTTTTATGGA TCCATAATTT AGTGATGAAG GATAGACCCG 

7 01 AGGATCTTTT TGTAAGTCTG CATGTCTGTC GTGGTGATTA TCAGGCCGAG 
751 TTTTTCTCTA GACGAGCTTA TGATTCTATA GAGGAGCCTT TATTTGCTAA 

8 01 GACCGATGTG GATAGTTATC ACTATTATTG GGCTCTTGAT GATAAGTATT 
851 CAGGAGGTGC TGAGCCTTTA GCTTACGTCT CTGGAGAGAA ACACGTCTGC 
901 TTGGGATTGA TCTCCAGCAA CCATTCTTGT ATTGAAGATC GAGATGCTGT 
951 GGTTTCTCGT ATTTATGAAG CTGCGAGCTA CATTCCCTTA GAGAGACTTT 

1001 CTTTGAGCCC GCAATGTGGG TTTGCTTCTT GTGAGGGAGA CCATAGAATG 
1051 ACTGAAGAAG AACAGTGGAA GAAGATCGCC TTTGTGAAAG AGATTGCTAA 
1101 AGAGATCTGG GGATAA 

The PSORT algorithm predicts cytoplasm (0.2196). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 122A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
122B) and for FACS analysis. 

These experiments show that cp6732 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 123 



The following C.pneumoniae protein (PID 4376738) was expressed <SEQ ID 245; cp6738>:

1 VWLRFLLLVS YDEKEKDVVV VCNHSEPNIL GLPPEAVSQL IEELSDEGYS

51 YLNVVRCDLS GETTVQQRLL LNADEGRSMT VVISELPEGH PDIRNLQLAS

101 ERIFVSREKE AADAYASGCK VVAFDDEHLP WVSSHIAYAE EIREKQEQTM

151 QGSLTEEQLG ALLCNTVSTE KNLAFALDAV IKQSVWRFRN PDLFAYEREA

201 LEASVTDALV SYVSNLDMIP YTSSQGIVIE DSSIVRTSQE HTLIVNCAAF

251 DKLASQIEFL CPSDVLPISG KDPLISDDED EELNPKVSSA ADSKDKT* 

The cp6738 nucleotide sequence <SEQ ID 246> is: 

1 GTGTGGCTGC GCTTTTTACT TTTAGTGTCC TATGATGAGA AGGAGAAAGA 

51 CGTAGTTGTC GTTTGTAATC ATTCTGAACC TAATATCCTC GGCCTGCCTC 

101 CTGAAGCAGT CTCTCAGCTT ATTGAAGAGC TTAGCGATGA AGGCTATAGC 

151 TATCTGAATG TAGTGCGTTG TGATCTCTCC GGGGAGACTA CGGTTCAACA 

201 ACGTCTGCTA TTGAATGCCG ATGAAGGGAG ATCTATGACG GTGGTGATCT 

251 CAGAGCTTCC TGAAGGGCAC CCCGATATTC GGAATTTGCA GTTGGCATCC

301 GAAAGAATTT TTGTTTCTCG TGAAAAAGAA GCTGCTGATG CCTATGCTTC

351 AGGATGTAAA GTGGTCGCTT TCGATGATGA GCATCTCCCT TGGGTCTCCA 

401 GTCATATTGC CTACGCGGAG GAGATCAGAG AGAAACAAGA ACAAACAATG 

451 CAAGGGTCTT TAACTGAAGA GCAGTTAGGA GCACTCCTCT GCAACACAGT 

501 CTCCACAGAG AAAAATCTAG CCTTTGCTCT AGACGCCGTG ATAAAACAGT 

551 CTGTGTGGAG ATTCCGCAAT CCGGATCTTT TTGCTTATGA GAGAGAAGCT 

601 CTAGAGGCTT CAGTAACAGA TGCTTTAGTA TCTTACGTTT CAAATTTAGA 

651 CATGATACCG TACACAAGTT CTCAGGGCAT AGTCATAGAA GATAGTAGTA 

701 TCGTCCGTAC CTCTCAAGAG CATACACTCA TTGTGAACTG TGCAGCATTC

751 GATAAGTTAG CGAGCCAAAT AGAGTTCTTA TGCCCCAGTG ACGTGTTGCC
801 CATTTCTGGT AAAGACCCTT TGATTTCTGA TGATGAGGAT GAGGAACTGA 
851 ATCCTAAAGT TTCATCTGCT GCAGACTCTA AAGATAAAAC CTAG 
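
The correspondence between a nucleotide listing and its protein listing can be checked mechanically. The following is a minimal sketch and not part of the original disclosure: it assumes Python 3 and the standard genetic code, and the helper names `clean_listing` and `translate` are hypothetical. It strips position labels and spacing from listing rows and translates codon by codon, shown here on the first 30 nucleotides of SEQ ID 246. The GTG start codon is rendered as V, as in the protein listing above.

```python
import re

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean_listing(rows):
    """Join listing rows, dropping position labels, spaces and any non-ACGT character."""
    return "".join(re.sub(r"[^ACGT]", "", row.upper()) for row in rows)


def translate(dna):
    """Translate in frame 1; '*' marks a stop codon. A trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First three blocks of the cp6738 nucleotide listing (hypothetical usage).
first_row = ["1 GTGTGGCTGC GCTTTTTACT TTTAGTGTCC"]
print(translate(clean_listing(first_row)))  # VWLRFLLLVS
```

Because `clean_listing` silently drops any character that is not A, C, G or T, a misread base shifts the reading frame from that point on, so a mismatch against the protein listing may indicate a transcription error rather than a real difference.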

The PSORT algorithm predicts cytoplasm (0.1587). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 123A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
123B) and for FACS analysis. 

These experiments show that cp6738 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 124 

The following C.pneumoniae protein (PID 4376739) was expressed <SEQ ID 247; cp6739>:

1 MTHCLHGWFS VVRHHFVQAF NFSRPLYSRI THFALGVIKA IPIVGHLVMG

51 VDWLISHCFE RGVSHPGFPS DIAPILKVEK IAGRDHISRI ENQLKSLRKT

101 IEVEDLDKVH GQYQENPYAD MASSEVLKLD KGVHVSELGK AFSRVRNRIT

151 RSYSYAPTPQ LDSIAIVGID LVSPEEQENL VRLANEVIQL YPKSKTTLYL

201 LIDFNKEWVG DISSDKEKQL RSLGLHSEVQ CLSVLEPQGA EGEDTKHFDL

251 MVGCYGKDSY LREGKILQQA LGTSLGTVPW VNVMHTLPSR YRSRLSLPIN

301 TEKDKTELYK EISRTHHQLH TLGMGLGAQD SGLLLDRQRL HAPLSQGSHC

351 HSYLADLTHE ELKILLFSAF VDAKNISKKE LREVSLNFAN PTSVECGCAF

401 YF*



The cp6739 nucleotide sequence <SEQ ID 248> is: 

1 ATGACTCATT GCTTACATGG TTGGTTTTCT GTAGTTCGTC ATCACTTTGT 

51 GCAGGCGTTT AATTTCTCAC GTCCTTTATA TTCTCGAATT ACCCACTTCG 

101 CTTTAGGGGT GATTAAGGCC ATCCCCATTG TAGGGCATCT TGTTATGGGA 

151 GTCGATTGGT TGATCTCTCA TTGCTTCGAG AGGGGAGTCT CACACCCTGG 

201 GTTCCCTTCA GATATTGCTC CTATACTGAA AGTAGAAAAG ATCGCGGGCC
251 GAGATCATAT TTCTAGAATC GAAAATCAGC TAAAGAGCCT TAGGAAAACT
301 ATCGAGGTTG AAGATCTAGA TAAAGTCCAC GGGCAATATC AAGAGAATCC

351 TTATGCAGAT ATGGCCTCTA GTGAGGTTCT TAAACTCGAT AAGGGAGTTC
401 ATGTTAGCGA GCTTGGCAAA GCCTTTTCTA GAGTTCGCAA TCGCATCACC
451 AGATCCTATA GTTATGCCCC TACTCCTCAG TTGGACTCTA TAGCTATTGT
501 TGGTATAGAT CTCGTCAGTC CTGAAGAACA AGAGAATTTA GTACGCTTGG
551 CGAATGAGGT CATTCAACTC TATCCCAAAT CAAAGACAAC TCTATATCTT 
601 CTTATCGATT TTAATAAGGA GTGGGTAGGG GATATCTCCT CTGATAAGGA 
651 AAAACAGCTC CGTTCTCTAG GTCTACATTC TGAAGTTCAG TGTCTTTCCG 
701 TCTTGGAACC TCAGGGTGCC GAGGGCGAAG ATACGAAACA CTTTGACCTT
751 ATGGTCGGCT GTTATGGGAA GGATTCTTAC TTAAGGGAGG GTAAAATTTT 
801 ACAGCAGGCC CTAGGGACTT CGTTAGGTAC TGTTCCCTGG GTGAATGTTA 
851 TGCACACATT GCCATCTAGG TATAGATCTC GGCTTTCCTT ACCTATAAAT 
901 ACCGAAAAGG ATAAGACAGA GCTTTATAAA GAGATTTCTC GTACACACCA 
951 TCAGTTGCAT ACTTTGGGAA TGGGACTTGG AGCCCAGGAT TCAGGATTGC 

1001 TCTTAGACCG GCAACGACTC CATGCTCCTT TATCTCAAGG GTCTCACTGC 

1051 CATTCCTATC TTGCAGATCT CACCCATGAA GAGCTGAAAA TTTTGTTATT

1101 TTCAGCATTT GTGGATGCTA AGAACATAAG TAAGAAAGAG CTTCGTGAGG 

1151 TATCTCTAAA TTTTGCTAAC GATACTTCCG TAGAGTGTGG CTGCGCTTTT 

1201 TACTTTTAG 

The PSORT algorithm predicts inner membrane (0.2190). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 124A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
124B) and for FACS analysis. 

These experiments show that cp6739 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 125 

The following C.pneumoniae protein (PID 4376741) was expressed <SEQ ID 249; cp6741>: 



1 MASCLSAWFS IVREHFYRAF DFSLPFCARI TEFVLGVIKG IPVVGHIIVG
51 IEWLVSRYLE SFVTKPTFVS DVVSLLKTEK VAGRDHIARV VETLKRQRVA
101 VAPEDEDKVH GKIPVHPFGG IQPVEVLTLY PEVQDATLGL AFSKIRNRVR
151 QAYLQAPRPK LQKIYIIGND MNPFEVDDFL HLARLCNETQ RLYPDATISL
201 YLTASGGRNA MDKKNRKLLS DCELNPKIAC LDFNQGDVVK QATCDCWMVY
251 HGENDQGTLN QIQEELEKSG EETPWIHVGQ KPLSQSLWDF SPFSSLEMKG
301 DKEKALEYSE LEKEQLYSRL VYVGERSSVL SLGFGDSRSG ILMDPKRVHA
351 PLSEGHYCHS YLADLENPGL QKTILAAFLN PKELSSTILQ PISLNLILNS
401 KTYLRQHFGF FERMSRSDRN VVVVVCDSWW GTDWKEEPSF QHFIMELECR
451 GYSHFNIFAF RSNSMCVEER RILNESSQEK AFTMIFCEDS VSQGDIRCLH
501 LASEGMLCGK ECYAVDVYTS GCANFMMEEV LTLERESNLW NRKHGLWKRE
551 VRKQKQEAAL DQDESEIYVC NQLTAQQNFA CS*

The cp6741 nucleotide sequence <SEQ ID 250> is:

1 ATGGCTTCTT GTTTATCTGC CTGGTTTTCT ATAGTTCGTG AGCACTTTTA
51 TCGAGCCTTT GATTTTTCTT TGCCGTTTTG TGCTCGTATT ACGGAATTTG
101 TATTAGGGGT CATCAAGGGG ATCCCTGTTG TGGGTCACAT TATTGTTGGG
151 ATAGAGTCGC TCGTTTCTAG GTATTTAGAG AGTTTCGTGA CCAAGCCGAC
201 ATTTGTCTCT GATGTGGTGA GTCTTCTGAA AACAGAGAAA GTTGCTGGTC
251 GCGATCACAT TGCTCGTGTA GTGGAGACTT TGAAGAGGCA GAGAGTCGCT
301 GTGGCTCCTG AAGATGAGGA TAAGGTCCAT GGGAAGATTC CTGTGCATCC
351 TTTCGGGGGA ATCCAACCTG TAGAAGTTCT CACTCTCTAT CCCGAAGTTC
401 AAGATGCAAC GTTAGGGCTT GCCTTCTCTA AAATTCGTAA TCGTGTAAGA
451 CAGGCGTATT TGCAAGCTCC ACGGCCAAAA CTGCAGAAGA TTTACATCAT
501 AGGAAACGAT ATGAATCCTT TTGAAGTTGA CGACTTCTTG CATCTAGCCC
551 GTCTCTGTAA TGAAACTCAA AGACTCTATC CTGACGCTAC GATTTCTCTA
601 TATCTAACAG CTTCTGGTGG TCGCAATGCT ATGGACAAAA AGAATCGGAA
651 GTTACTTAGT GATTGCGAAC TAAACCCCAA GATTGCTTGT TTGGACTTTA
701 ATCAGGGTGA TGTAGTCAAA CAAGCAACTT GTGACTGTTG GATGGTGTAT
751 CATGGGGAGA ATGATCAAGG TACGTTGAAT CAGATTCAGG AAGAGTTAGA
801 AAAGTCAGGG GAGGAAACCC CTTGGATTCA TGTGGGGCAA AAGCCTCTTT
851 CACAATCCTT GTGGGATTTC TCTCCATTTT CATCTTTGGA GATGAAGGGA
901 GATAAAGAGA AAGCTCTAGA GTACTCTGAA TTAGAAAAAG AACAGCTATA
951 TTCTCGATTG GTATACGTAG GAGAGCGCTC TTCGGTTCTT AGTTTGGGGT
1001 TTGGAGATAG TCGGTCAGGG ATCTTGATGG ACCCAAAACG GGTGCATGCT
1051 CCCTTATCTG AAGGGCATTA TTGTCATTCC TACCTTGCAG ACTTAGAAAA
1101 TCCCGGGTTA CAAAAAACAA TTTTAGCGGC ATTTCTGAAT CCTAAGGAGT
1151 TGAGCAGTAC CATACTGCAA CCTATATCTC TAAATCTTAT CTTAAATAGC
1201 AAAACTTACT TAAGGCAGCA CTTTGGCTTT TTTGAGAGGA TGAGCAGAAG
1251 TGATCGCAAT GTGGTTGTCG TTGTATGTGA TTCTTGGTGG GGTACCGACT
1301 GGAAGGAGGA GCCAAGCTTC CAACACTTTA TTATGGAGCT AGAGTGTCGA
1351 GGGTATTCGC ACTTCAATAT TTTTGCCTTT AGATCTAATA GCATGTGTGT
1401 AGAAGAACGT AGGATCTTAA ATGAAAGTTC TCAAGAGAAA GCCTTTACCA
1451 TGATTTTCTG TGAGGATTCA GTATCFCAAG GAGATATCCG CTGTTTGCAT
1501 TTGGCGTCTG AAGGAATGCT TTGTGGTAAA GAGTGCTATG CTGTCGATGT
1551 CTATACGTCA GGATGCGCGA ACTTTATGAT GGAAGAAGTC TTAACTTTGG
1601 AGCGAGAATC TAATCTGTGG AATAGAAAGC ATGGTCTTTG GAAAAGAGAA
1651 GTTAGAAAAC AGAAACAAGA AGCTGCTTTG GATCAAGACG AGAGCGAGAT
1701 TTACGTTTGT AATCAGCTGA CGGCGCAACA GAACTTCGCT TGTTCTTGA



The PSORT algorithm predicts inner membrane (0.2869). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 125A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
125B) and for FACS analysis. 

These experiments show that cp6741 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 126 

The following C.pneumoniae protein (PID 4376742) was expressed <SEQ ID 251; cp6742>:

1 LFVSNFIFFV VMPIPYISSW ISTVRQHFVK AFDFSRPFCS RVTNFALGVI

51 KAIPIVGHIV MGMEWLVSSC VAGIITRSSF TSDVVQIVKT EKALGRDHIS

101 RVAEILQRER GTITPENQDK VHGKFPVCPF GRLKSEETLK LKPGEREGTL

151 DTVFSPIRTR VTRAYLQAPR PEIRTISIVG SKLKTPQDFS QFVSLANETQ

201 RLHPEALVCL YLTGLNRESQ MCDTTTAEKK QYLHNSGLDS RIQCKDSKED

251 DAGSPENPEL WIGYYSREQQ HNIDGQYIQQ CLGKSADPIP WIHVTEDTKD

301 FYYPPNFTSY SHTRQSTDPT SPPRLPESEG DKDSLYGQLS RSYHHEYMLG

351 LGLKPEDAGL LMDPDRIYAP LSQGHYCHSY LADIENEDLR TLVLSPFLDP

401 GNLSSEDLRP VAFNIARLPL ELDSLFFRLV AGQQEGRNIV TLAHGTPRPE

451 DLDPDSMNIL TRRLQMSGYS YLNIFSYKSR KMIVKERQFF GDRSEGKSFT

501 LILFEDPISA ADFRCLQLAA EGMVAKDLPS VADICASGCS CIQFSEMQSP

551 QAIEYRQWEA RVEDEAGEEA REPVIYSQDQ LSSMLTTQQN FVFSLDAVVK

601 QAIWRFRSKG LLTMERKALG EEFLTAIFSY LGSQERNENM GKRTTEEHEV

651 VISFEELDRM VQVLPAEVPA DSGNDPTRPV PNPDSNPDSS QNEGS*

The cp6742 nucleotide sequence <SEQ ID 252> is: 

1 TTGTTTGTTT CTAATTTTAT TTTTTTTGTT GTTATGCCAA TTCCCTATAT

51 TTCTTCTTGG ATTTCTACCG TTCGACAGCA TTTTGTTAAG GCGTTTGATT 
101 TCTCTCGTCC CTTTTGTTCT AGGGTTACGA ATTTTGCTTT AGGGGTCATC 
151 AAGGCCATCC CTATTGTAGG ACATATTGTC ATGGGGATGG AGTGGTTAGT 
201 TTCTTCCTGT GTTGCCGGGA TTATTACTAG GTCCTCCTTT ACCTCAGATG

251 TCGTTCAGAT TGTAAAGACT GAGAAGGCGT TAGGTCGAGA TCATATATCT

301 CGAGTGGCGG AGATATTGCA AAGAGAAAGG GGGACCATAA CTCCTGAGAA 
351 TCAAGATAAG GTGCATGGGA AGTTTCCTGT CTGTCCTTTT GGTCGTTTAA 
401 AATCCGAGGA AACTTTAAAA CTTAAGCCGG GAGAAAGAGA GGGAACTTTA 
451 GATACTGTAT TTTCTCCGAT TCGCACGCGC GTGACTCGTG CGTACTTACA

501 GGCCCCCCGA CCCGAAATAC GTACGATTTC TATTGTGGGT TCGAAACTTA 
551 AAACTCCTCA AGATTTCTCG CAATTTGTGA GTCTCGCGAA TGAAACGCAG 
601 AGACTGCATC CTGAAGCGTT AGTTTGTCTG TATTTGACAG GCTTGAATCG 
651 CGAATCTCAG ATGTGCGATA CAACTACTGC AGAGAAGAAG CAGTACCTAC 
35 ^^ TCAGG TCTCGACT ^ AGAATCCAGT GCAAAGACAG TAAAGAAGAC 

751 GACGCTGGCT CTCCTGAAAA TCCCGAACTT TGGATTGGCT ATTATTCACG

801 AGAGCAACAG CATAATATAG ACGGGCAGTA TATTCAGCAG TGTCTAGGGA 
851 AGAGTGCAGA TCCAATTCCT TGGATTCATG TTACTGAAGA CACAAAGGAT 
901 TTTTATTACC CACCAAACTT TACTTCATAC TCACATACAA GACAATCTAC 
951 AGACCCAACA TCGCCACCAA GACTCCCTGA AAGTGAGGGG GATAAGGATT

1001 CCTTGTACGG ACAACTGAGT CGATCGTATC ACCATGAGTA TATGCTTGGT

1051 TTGGGATTAA AACCAGAGGA TGCAGGACTC CTGATGGACC CGGATAGAAT 
1101 CTATGCTCCT CTATCCCAAG GGCATTATTG TCATTCCTAC CTTGCGGATA
1151 TAGAAAATGA GGATCTACGA ACTTTAGTCC TTTCGCCTTT CCTAGATCCT 
1201 GGCAATCTTA GTAGCGAGGA TCTTCGTCCT GTAGCATTCA ATATCGCTAG

1251 ATTGCCATTA GAATTGGACT CGTTATTTTT CCGCCTTGTT GCGGGTCAGC

1301 AAGAAGGGAG AAACATAGTT ACCCTTGCCC ACGGAACTCC TCGTCCAGAA 
1351 GATCTTGATC CTGACTCAAT GAACATTCTG ACCAGAAGAT TACAAATGTC 
1401 TGGATATAGC TATTTGAACA TTTTCTCCTA TAAATCACGG AAAATGATTG 
1451 TAAAAGAACG TCAGTTCTTT GGAGATCGTT CTGAAGGGAA GTCTTTCACA

1501 TTGATCTTAT TTGAGGATCC CATTAGTGCA GCAGATTTCC GTTGTTTGCA

1551 GCTAGCTGCA GAAGGTATGG TTGCTAAGGA TCTCCCCAGC GTAGCAGATA 
1601 TTTGTGCCTC TGGATGTTCC TGCATTCAGT TTTCTGAGAT GCAGAGTCCT 
Irltrlt^ ^A™^ ATGGGAGGCA CGTGTCGAAG ATGAAGCAGG 
55 III, AGAGAAC CAG TAATTTATTC TCAGGATCAA TTGAGCAGCA 

55 1751 TGCTCACTAC ACAACAGAAT TTTGTATTTT CTCTAGATGC TGTGGTAAAA 

1851 GGCAcSp ^ ATTCCG TTCGAAAGG ^ CTTCTTACTA TGGA^I^ 

i9oi aggag™^ S A ^ AGTTCT taactgcgat attttcctat ttagggagtc 
ttt^T^ tgagaatatg gggaaaagaa ctaccgaaga acatgaggtc 
60 agtccctgS n?^ GAGCT AGATCGCATG gtgcaagtcc tcccagccga 

2001 AGTCCCTGCA GATTCAGGCA ATGATCCTAC GCGTCCCGTT CCTAATCCAG 
2051 ATAGTAACCC TGATTCCTCG CAAAATGAAG GCAGTTAG

The PSORT algorithm predicts inner membrane (0.2338).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 126A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
126B) and for FACS analysis.

These experiments show that cp6742 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 127 

The following C.pneumoniae protein (PID 4376744) was expressed <SEQ ID 253; cp6744>: 

1 VIQHLLNFAL EETPSISVQY QEQEKLSPCD HSPEIGKKKR WNKLESFSTY

101 KELRRSLLST FLDPKNLTSE TFRSVSINFG NSSFGQRWSE FLSRVLHDEK

III v^n^S AKLLEEG1 * SP EALSLLEEDL RESGYSYLNI DSVSPEGVSK 

201 VQERQILRRD LQGRSFTVMI TDLPLGSEDI RSLQLASDRI LVSSSLDAAD

251 ACASGCKVLV YENPNASWAQ ELENFYKQVE RRR*

The cp6744 nucleotide sequence <SEQ ID 254> is:

1 GTGATACAAC ATCTTCTAAA CTTTGCTCTA GAAGAGACCC CTTCCATTTC
51 CGTGCAATAC CAAGAACAAG AGAAGCTCTC TCCGTGCGAT CATTCCCCAG 
l^ ^™ GGTAA AAAG&AAAGA TGGAATAAGC TGGAATCCTT CTCCACGTAT 
20 III IZlt^ TTATGTCTGT TAAGGATCAT TATAAGCTGA ATCTAGGAAT 

/U ^01 TC*O**™0C CTGTCAGGGT GGCTTCTGGA TCCCTATAGG GTTTGCGCGC 

251 CTTTATCTTC ACCGTACTCG TGTCCTTCCT ATCTTTTAGA TTTGCAAAAC 
301 AAAGAGCTAC GTCGTTCCCT TCTGTCAACG TTTCTAGACC CTAAAAATCT 
inn ^^ GCGAA ACATTCCG TT CTGTCTCTAT AAACTTTGGC AACTCTTCGT 
05 t*} J?** 0 **** ATGGTCAGAG TTTCTATCTC GTGTTCTGCA CGACGAGAAA 

SOI ?™^ ACG TAGCTCTT ^ TTGTAATGAT GCAAAACTTC rGGAAGAAGG 
501 ATTGTCCCCA GAGGCATTGT CTCTATTAGA AGAAGACTTA AGAGAATCAG 
551 GGTATTCGTA TCTAAACATT CTCTCGGTGA GCCCCGAAGG AGTCTCCAAG
601 GTTCAGGAAC GTCAGATTCT AAGGCGAGAT CTCCAAGGAC GGTCCTTTAC
651 TGTCATGATT ACAGATCTTC CTTTAGGTAG CGAAGATATC CGTAGTTTAC
701 AATTAGCCTC GGATAGGATT TTAGTCTCCA GTTCTCTTGA TGCCGCGGAT
751 GCATGTGCTT CGGGATGTAA AGTCTTAGTC TACGAAAATC CAAATGCATC



801 CTGGGCTCAG G^GGAgI AC^AG SSSS 



851 AG 

The PSORT algorithm predicts cytoplasm (0.3833). 



The protein was expressed in E.coli and purified as a GST-fusion product (Figure 127A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
127B) and for FACS analysis. 

These experiments show that cp6744 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 128

The following C.pneumoniae protein (PID 4376745) was expressed <SEQ ID 255; cp6745>: 

1 VACPSISSWF TVVRQHFVNA FDFTHPVCSR ITNFALGIIK AIPVLGHIVM

101 p^ ISWIP R™*" SDVSSAIKVE O.TRGHNCLAP LEAYL S SLRV 

45 Tci £;f$f DLGKV HGRTPEDPFV DITPTE1VQL LPDEELSTVD EALQGVRSRL 

45 HI LSf WKP MIQDLALVGF GLRDSADLIN FVRLANGVQN HYPHTKVKIjY 

251 SSSSSJ EVQDP" EKGQ LRALGLDPKI ES ^*™ PSVPEVAWD 

The cp6745 nucleotide sequence <SEQ ID 256> is:

1 GTGGCTTGTC CAAGTATTTC TTCTTGGTTT ACTGTCGTTC GACAGCATTT 

51 TGTAAACGCC TTTGATTTCA CCCATCCCGT TTGTTCTCGG ATTACAAATT 

5 HI ™ GATCATTAAG GCAATTCCCG TATTAGGACA 

III f^ T ^ AGT GGTTGATT TC CTGGATTCCC AGACACACCG TTCGTCATGG 

251 TCTGATGTCT CTAGTGCTAT TAAAGTAGAA CAAACACGgI 

?ni ^^J TG TTTAGCTCCC CTAGAAGCCT ATTTAAGTAG CTTGAGAGTC 

III ^™° CC AMM »»CT AGGCAAAGTA CACGGGAGAA CCCCAGAAGA 

10 ^ TCCCTTCGTA GATATCACAC CCACAGAAAT TGTCCAACTT CTCCCTGATG 

til ^ CTCTC TACTGTAGA * GAGGCACTGC AAGGCGTTCG TAGTAGGTTA 

^ ACC * ATGCCT ATAGGTCCGT AGAGAAACCT ATGATTCAAG ATCTTGCTCT 

501 TGTGGGTTTT GGTCTCCGAG ATTCTGCGGA CCTCATAAAT TTCGTGCGTC 

III ™ TAATG ° CGTG CAGAAT CACTATCCCC ATACTAAAGT GAAGCTC^T 

15 ™ A ? GA ACTTGGCAGA TGTCTGGGAC TCTGAAATTT CTGAAGAGGA 

^ AAAA ^ G ^ AA CTCCGAGCTC ^GGTTTAGA CCCTAAAATA GAGAGTATAT 

701 CCCTTACGAG TGCAGGTCTT CCTTCAGTGC CAGAAGTCGC TACTGTCGAT 

751 TTTATGATTA CCTGTTACGG GAAAGATCAG GAAGTCCAAG ATCCCTAG 

The PSORT algorithm predicts inner membrane (0.2253). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 128A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
128B) and for FACS analysis.

These experiments show that cp6745 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 129 

The following C.pneumoniae protein (PID 4376747) was expressed <SEQ ID 257; cp6747>:

1 MMKQGVGQDA KELYTFLSRG NEHYQPCLWF SLEEELGFLF DEKMLCAPLS 

51 EDHYCHSYLV DLVDQHLKDL ILSMFLDPQN ISAGELLKVS INVGDSFSPL

101 QQKDFLSMVL RDETGKNVVV VFKGVLSLPA TQVCKLVEEL NSKDYSYLNI

151 FSCHGDSSPQ LLFRKELEGT SGRYFTVICA LYLGDTDMRS LQLASERIMV

251 ™^ f«™» HTNWRPGTFS RHADFADAVD 

251 KLITQANQGI LESGELPLPS KTFWEGFLAF CDRVTVTRHF IPMLDAAIKQ

351 IDKECEALDL KTQCLPSrvS YliEYVTNSHE Sg^iS 

351 EIIADCSPLK EALFPGSDED VPSTSEDPSD DHPSDLEDS* 

The cp6747 nucleotide sequence <SEQ ID 258> is: 

1 ATGATGAAAC AAGGAGTCGG GCAGGATGCT AAAGAGCTAT ACACATTTCT

51 ATCTCGTGGG AATGAGCATT ACCAACCGTG TCTATGGTTC AGTCTCGAAG
101 AGGAACTCGG ATTCCTTTTC GATGAAAAAA TGCTCTGCGC CCCTCTATCT
151 GAGGATCACT ATTGCCACTC GTATCTTGTA GATCTAGTGG ATCAACATTT
201 AAAGGATTTA ATATTATCGA TGTTTTTAGA TCCTCAGAAT ATCTCAGCAG
251 GAGAACTCCT CAAGGTCTCT ATAAACGTTG GAGATTCTTT TTCTCCTCTA
™ AGAAAG ATTTCCTCTC GATGGTCTTA CGTGATGAAA CGGGAAAAAA 
351 CGTCGTCGTG GTTTTTAAAG GAGTTCTCTC CTTACCCGCA ACCCAAGTCT
,.1 S™™ 01 AGAGGAATTG AACTCTAAGG ACTACTCCTA CCTCAATATA 
45 HI 1™™*° ACGGAGATAG TAGTCCTCAG CTTTTATTCC GTAAGGAATT 

501 AGAGGGAACT TCAGGGCGTT ATTTTACAGT GATTTGCGCT TTATATCTAG

551 GGGATACAGA CATGCGTAGT TTACAACTTG CTTCTGAAAG GATCATGGTC
111 lll^tl TTGATCTTOT AGATGCCTAT GCTGCAAGAT GCAAGCTCTT 
651 GAAAATCGAT CATACAAATT GGAGACCTGG AACTTTCAGT CGCCACGCCG 

50 751 TGCTGTAGAC ™agcag ga^tLctc Sm 

50 HI ^ TGATTA C <5CAGGCGAA TCAAGGGATC CTAGAGTCTG GAGAACTCCC 

801 GCTCCCTTCA AAAACCTTCT GGGAAGGATT CTTAGCATTC TGTGATCGAG 

851 TGACTGTCAC GAGACACTTC ATTCCAATGT TAGACGCCGC TaSgCA^ 

901 GCGGTA^GA CTCATAAACA TCCCAGCTTG ATaSSg « 

„ C ™ ACTTG AAAACACAGT GCTTGCCATC TATCGTATCG TACCTTGAAT 

1001 ATGTCACAAA CTCTCACGAA AAAACATCGA AAGGCCCGTT CATACAAAAA
1051 GAGATTATCG CAGACTGTTC TCCTCTTAAA GAGGCGCTCT TCCCAGGTTC
1101 TGATGAAGAT GTTCCCTCTA CCTCTGAGGA TCCTTCAGAT GATCATCCTT
1151 CGGATCTTGA AGACTCTTAA 

The PSORT algorithm predicts inner membrane (0.1447). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 129A) and also as
a his-tagged product. The recombinant proteins were used to immunise mice, whose sera were used
in a Western blot (Figure 129B) and for FACS analysis. 

These experiments show that cp6747 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 130 

The following C.pneumoniae protein (PID 4376756) was expressed <SEQ ID 259; cp6756>: 



1 MASGIGGSSG LGKIPPKDNG DRSRSPSPKG ELGSHEISLP PQEHGEEGAS
51 GSSHIHSSSS FLPEDQESQS SSSAASSPGF FSRVRSGVDR ALKSFGNFFS
101 AESTSQARET RQAFVRLSKT ITADERRDVD SSSAAATEAR VAEDASVSGE
151 NPSQGVPETS SGPEPQRLFS LPSVKKQSGL GRLVQTVRDR IVLPSGAPPT
201 DSEPLSLYEL NLRLSSLRQE LSDIQSNDQL TPEEKAEATV TIQQLIQITE
251 FQCGYMEATQ SSVSLAEARF KGVETSDEIN SLCSELTDPE LQELMSDGDS
301 LQNLLDETAD DLEAALSHTR LSFSLDDNPT PIDWKPTLIS QEEPIYEEIG
351 GAADPQRTRE NWSTRLWNQI REALVSLLGM ILSILGSILH RLRIARHAAA
401 EAVGRCCTCR GEECTSSEED SMSVGSPSEI DETERTGSPH DVPRRNGSPR
451 EDSPLMNALV GWAHKHGAKT KESSESSTPE ISISAPIVRG WSQDSSVSFI
501 VMEDDHIFYD VPRRKDGIYD VPSSPRWSPA RELEEDVFGD YEVPITSAEP
551 SKDKNIYMTP RLATPAIYDL PSRPGSSGSS RSPSSDRVRS SSPNRRGVPL
601 PPVPSPAMSE EGSIYEDMSG ASGAGESDYE DMSRSPSPRG DLDEPIYANT
651 PEDNPFTQRN IDRILQERSG GASASPVEPI YDEIPWIHGR PPATLPRPEN
701 TLTNVSLRVS PGFGPEVRAA LLSESVSAVM VEAESIVPPT EPGDGESEYL
751 EPLGGLVATT KILLQKGWPR GESNA*

The cp6756 nucleotide sequence <SEQ ID 260> is: 

1 ATGGCATCAG GAATCGGAGG ATCTAGTGGA TTAGGAAAGA TTCCACCTAA 

51 AGATAATGGG GATAGAAGTC GATCGCCCTC TCCTAAGGGA GAACTTGGCA 

101 GCCACGAGAT TTCCCTGCCT CCTCAAGAAC ATGGAGAGGA AGGAGCTTCA 

151 GGATCTTCGC ATATACATAG CAGTTCCTCT TTTCTACCAG AAGATCAGGA 

201 GTCTCAGAGC TCTTCTTCGG CAGCTTCTAG CCCGGGATTT TTTTCTCGCG 

251 TACGTTCTGG GGTAGACAGG GCCTTAAAAT CATTTGGCAA CTTTTTTTCC

301 GCAGAGTCTA CGAGTCAAGC GCGTGAAACG CGACAAGCTT TTGTTAGATT 

351 ATCAAAAACC ATCACCGCGG ATGAGAGACG GGATGTCGAT TCATCAAGTG 

401 CTGCTGCTAC AGAAGCCCGA GTGGCAGAGG ACGCGAGTGT TTCAGGCGAA 

451 AATCCTTCTC AGGGGGTTCC AGAAACCTCT TCTGGACCAG AACCTCAGCG 

501 TTTATTTTCT CTTCCTTCAG TAAAAAAACA GAGCGGTTTG GGTCGGTTGG 

551 TACAGACAGT TCGCGATCGC ATAGTACTTC CTAGTGGGGC TCCACCTACA 

601 GACAGCGAGC CTTTAAGTCT CTACGAGCTA AACCTCCGTT TGAGTAGTTT 

651 ACGTCAGGAG CTCTCTGACA TACAAAGTAA TGATCAGTTG ACTCCAGAGG 

701 AAAAAGCAGA AGCCACAGTT ACCATACAAC AGCTGATCCA AATTACAGAA 

751 TTCCAATGCG GCTATATGGA GGCAACACAA TCTTCGGTAT CTCTAGCAGA 

801 AGCTCGTTTT AAGGGGGTAG AAACTAGTGA TGAGATCAAT TCCCTCTGTT 

851 CAGAACTGAC AGATCCTGAG CTTCAAGAAC TCATGAGTGA TGGAGACTCT 

901 CTTCAAAACC TATTAGATGA GACTGCCGAC GATTTAGAAG CTGCTTTGTC 

951 CCATACTCGA TTGAGTTTTT CTTTAGACGA TAATCCAACT CCGATAGACA 

1001 ATAATCCAAC TCTGATTTCT CAAGAAGAGC CTATTTATGA GGAAATCGGA

1051 GGAGCTGCAG ATCCTCAAAG AACTCGGGAA AACTGGTCTA CAAGATTATG 

1101 GAATCAGATT CGCGAGGCTC TGGTTTCTCT TTTAGGAATG ATTTTAAGCA 

1151 TTCTAGGGTC CATCTTGCAC AGGTTGCGTA TTGCTCGTCA TGCAGCTGCT 

1201 GAAGCAGTGG GTCGTTGTTG CACGTGCCGA GGAGAAGAGT GTACTTCTTC

1251 TGAAGAGGAC TCGATGTCGG TGGGGTCTCC TTCAGAAATT GATGAAACTG 

1301 AAAGAACGGG CTCTCCGCAT GACGTTCCAC GCAGAAATGG AAGTCCACGT 

1351 GAAGATTCTC CATTGATGAA TGCCTTAGTA GGATGGGCAC ATAAGCACGG 

1401 TGCTAAAACC AAGGAGAGTT CAGAATCAAG TACCCCGGAA ATTTCGATTT 

1451 CTGCTCCCAT AGTGAGAGGT TGGAGTCAAG ACAGTTCCGT CAGTTTTATT 

1501 GTTATGGAAG ATGATCATAT TTTCTATGAT GTTCCTCGTA GAAAAGATGG

1551 AATCTATGAC GTTCCTAGTT CCCCTAGATG GAGTCCTGCG CGAGAGTTGG 

1601 AAGAGGATGT TTTTGGAGAT TATGAAGTTC CTATAACCTC TGCTGAACCA 

1651 TCTAAAGACA AGAACATCTA CATGACACCT AGATTAGCAA CTCCTGCTAT 

1701 CTATGATCTT CCTTCACGTC CAGGATCGTC TGGAAGCTCA CGTTCTCCGT 

1751 CTTCAGATCG CGTACGAAGC AGCTCACCAA ATAGACGGGG TGTGCCTCTT 

1801 CCTCCAGTTC CTTCACCTGC TATGAGTGAG GAGGGGAGCA TTTATGAGGA 

1851 TATGAGCGGT GCTTCAGGTG CAGGTGAAAG TGATTATGAA GATATGAGCC 

1901 GTTCCCCCTC TCCTAGAGGC GACTTGGATG AACCCATATA TGCTAATACT

1951 CCTGAAGATA ATCCATTTAC TCAGAGAAAT ATAGATAGAA TTTTACAGGA 

2001 GAGGTCAGGC GGTGCTTCCG CTTCTCCTGT AGAGCCTATT TATGATGAGA 

2051 TCCCATGGAT TCATGGCAGG CCCCCTGCTA CACTTCCAAG ACCCGAGAAT

2101 ACATTGACTA ATGTTTCGCT TAGAGTGAGC CCAGGGTTTG GACCAGAAGT 

2151 AAGAGCCGCT TTGCTTAGCG AGAGCGTGAG TGCTGTTATG GTCGAAGCAG

2201 AGAGTATTGT TCCTCCAACA GAGCCGGGGG ACGGAGAATC AGAATATCTA

2251 GAGCCCTTAG GGGGACTTGT AGCTACAACG AAAATCTTAC TACAAAAAGG

2301 ATGGCCTCGT GGAGAGTCGA ATGCTTAG
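
A quick length check ties the two cp6756 listings together. The following is a sketch under stated assumptions and not part of the original disclosure: it assumes Python 3, the helper name `listing_length` is hypothetical, and the inputs are the final rows of SEQ ID 259 and SEQ ID 260 with their position labels read as 751 and 2301. The label of the final row plus the number of symbols on it gives the total length, and a coding sequence should be three times as long as the protein once the stop codon is counted.

```python
def listing_length(last_row):
    """Total sequence length implied by the final row of a listing.

    The row starts with the 1-based position of its first symbol,
    followed by space-separated blocks of residues or bases.
    """
    label, *blocks = last_row.split()
    return int(label) - 1 + sum(len(block) for block in blocks)


# The trailing '*' is counted, so protein_len includes the stop position.
protein_len = listing_length("751 EPLGGLVATT KILLQKGWPR GESNA*")
dna_len = listing_length("2301 ATGGCCTCGT GGAGAGTCGA ATGCTTAG")

print(protein_len, dna_len)        # 776 2328
print(dna_len == 3 * protein_len)  # True
```

The same check can be applied to any other example whose final rows are legible; a mismatch suggests that a row or block was lost from one of the two listings.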

The PSORT algorithm predicts inner membrane (0.3994). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 130A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
130B) and for FACS analysis.

These experiments show that cp6756 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 131 

The following C.pneumoniae protein (PID 4376761) was expressed <SEQ ID 261; cp6761>:

1 MTVAEVKGTF KLVCLGCRVN QYEVQAYRDQ LTILGYQEVL DSEIPADLCI

51 INTCAVTASA ESSGRHAVRQ LCRQNPTAHI VVTGCLGESD KEFFASLDRQ

101 CTLVSNKEKS RLIEKIFSYD TTFPEFKIHS FEGKSRAFIK VQDGCNSFCS

151 YCIIPYLRGR SVSRPAEKIL AEIAGVVDQG YREVVIAGIN VGDYCDGERS

201 LASLIEQVDR IPGIERIRIS SIDPDDITED LHRAITSSRH TCPSSHLVLQ

251 SGSNSILKRM NRKYSRGDFL DCVEKFRASD PRYAFTTDVI VGFPGESDQD

301 FEDTLRIIED VGFIKVHSFP FSARRRTKAY TFDNQIPNQV IYERKKYLAE

351 VAKRVGQKEM MKRLGETTEV LVEKVTGQVA TGHSPYFEKV SFPVVGTVAI

401 NTLVSVRLDR VEEEGLIGEI V*

The cp6761 nucleotide sequence <SEQ ID 262> is:

1 ATGACGGTTG CGGAAGTCAA AGGAACATTT AAGCTGGTCT GTTTAGGCTG
51 TCGGGTGAAT CAGTATGAGG TCCAAGCATA TCGCGACCAG TTGACTATCT
101 TAGGTTACCA AGAGGTCCTG GATTCTGAAA TCCCTGCAGA TTTATGCATA
151 ATCAATACGT GTGCTGTCAC AGCTTCTGCT GAGAGTTCGG GTCGTCATGC
201 TGTGCGTCAG TTATGTCGTC AGAACCCTAC AGCACATATT GTTGTCACAG
251 GTTGTTTGGG [...]
301 TGCACACTTG [...]
351 TTCCTATGAT [...]
401 AGTCTCGAGC TTTTATTAAA GTTCAAGATG GCTGTAATTC TTTTTGCTCG
451 TACTGCATTA [...]
501 GAAGATTTTA [...]
551 TTGTAATTGC [...]
601 TTAGCCTCTT [...]
651 TCGAATTTCC [...]
701 CCATCACCTC [...]
751 TCGGGGTCGA [...]
801 AGATTTTTTA [...]
851 CCTTTACTAC [...]
901 TTTGAAGATA [...]
951 TAGTTTCCCT TTCAGTGCTC GTCGTCGTAC TAAGGCATAT ACTTTTGATA
1001 ATCAGATTCC [...]
1051 GTTGCTAAGA [...]
1101 TACAGAGGTG CTTGTTGAGA AAGTAACGGG GCAGGTTGCT ACGGGTCACT
1151 CTCCTTATTT TGAAAAGGTT TCTTTCCCTG TTGTAGGAAC GGTAGCTATC

i«i ^ C ^ TCTAG ^TGTGCG TCTTGATAGG GTAGAGGAAG AAGGGCTGAT 
1251 TGGGGAGATT GTATGA 

The PSORT algorithm predicts inner membrane (0.1574). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 131A) and also as
a his-tagged product. The recombinant proteins were used to immunise mice, whose sera were used
in a Western blot (Figure 131B) and for FACS analysis.

These experiments show that cp6761 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 132 

The following C.pneumoniae protein (PID 4376766) was expressed <SEQ ID 263; cp6766>:



1 MATSVPVTSS [...]
51 IFPQVGLWAV [...]
101 AQKEWTTQQD [...]
151 NILKLEPLST [...]
201 LLFLIEEQYY [...]
251 REECSPEDAL [...]
301 SLQRKLPETA [...]
351 CYESANQRLD [...]
401 ILENESGFLC SLYEYPLSYL IDWAVLLDCV RGTEISLEDQ ADYTVCLQGL
451 DSMLSQFASR [...]
501 YLTAVPQRMW [...]



The cp6766 nucleotide sequence <SEQ ID 264> is: 



1 ATGGCAACCT CTGTTCCTGT AACTTCATCT ACTTCTGTAG GAGAGGCTAA
51 CTCCTCCAAC GAAAGATTTA CTGAACGAAC ATCGCGAATG TATTACGCAG
101 CTTTAGTCCT AGGGGCTTTG AGCTGTTTAA TTTTTATTGC TATGATTGTC
151 ATTTTCCCAC AGGTCGGATT GTGGGCTGTG GTCCTCGGGT TTGCTCTTGG
201 ATGTTTACTT TTAAGCTTAG CTATCGTTTT TGCTGTCTCC GGTCTCGTTT
251 TAGGCAAGAC TTTAGAACCT AGTCGAGAAG CGACTCCTCC AGAAATTGTT
301 GCGCAAAAGG AGTGGACTAC ACAACAAGAT GTCTTAGGGA ATGAGTATTG
351 GCGTTCCGAG TTGATTTCCT TGTTCTTACG AGGGGATCTC CACGAATCTC
401 TGATTGTTGA TTCTAAGGAT CGATCTTTAG ATATTGATCA GAGTTTACAA
451 AATATATTGA AACTTGAGCC CCTATCTACG ACACTTTCGC TGTTAAAGAA
501 AGATTGTGTC CACATCAATA TCATTTTACA TTTAGTGAGA CAGTGGAACT

^ ™ GGAGT GGATCTTAG * CCTGAAGTCA CTGCGCACGC CGAGGAACTT 
601 CTACTCTTTT TGATAGAAGA GCAGTATTAC TCTCCTGATA TTTTGAAATT 
651 GATTCGCTAC GGAGATGCTT TACAAGCAAC GTCTCCTTTG ATGGATTGPG 
40 ™ CAGATTCAGG TTCCTTTAGT GTAGACGCAG ACGGGGTATT tIgCTGTCGC 

40 751 agagaagaat gttctcctga ggatgctttg gcgcaattcg atcttctt?t 

801 GGCGTTGGAA AATCCCGACA GACGCTTCTT AAAGGATTCT TTTCTTACCT

851 ACATTTGGTC GTCTTCATTT TTTGAGAAGT TTTTACATCG CCATCTAGAG

q*1 A ^^ CAAA GAAAGCTCCC AGAGACAGCG ATCGATGTCG CCCGCTATGA 

951 AGCACAAATA CAAACATTTC TCTCTCGCTA TTTTCAGAAG CTCGATTTGA

^ ™ CGCAAT GTCCTTAGA T TGGGGATATA ACTGTGCTGA GGGAGAAAAA 

1051 TGTTATGAGA GCGCAAATCA AAGATTAGAC AACCTATTTA TTGCTTTTTC 
llll CCTGCTATGA AGCGGCTCTT TGACAAAT^ S 

1151 TACGGGTAGA TCGTAGGCAG ATTCGTGAGC AGATTCTTTC GAACACTGAA 

1201 ATCTTAGAAA ATGAGTCAGG GTTCCTCTGC AGTTTGTATG AATATCCTTT

1301 A ™ GATTGGG c *g™c T AGACTGTGTT C^CGCTACCG 

1301 AAATCTCTCT AGAAGATCAG GCCGATTACA CCGTTTGTTT GCAAGGCTTG 

, G ™ ATGT TATCTCAATO TGCGAGTCGT TTACAGTCTG gSSS 

1401 ATTGAATCCT AGAGATGTTT TAAGTGAACA GGCTGCGGTT ATGCTTGTTC

„ ATGGCTTGGC AGCACAGGGC GTGTCGTTTC AAGGATTGaI AGCTTTGATG 

55 1501 TATTTGACAG CCGTTCCCCA AAGAATGTGG TTAGGAGCM 2gS£ot 

llll TGGGAGACTA £ CTGTC ' I " :[ " I ' A AT CGGATGAA AGAATTTCTT GGGGAATCTC 

The PSORT algorithm predicts inner membrane (0.6158).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 132A) and also as 
a his-tagged product. The recombinant proteins were used to immunise mice, whose sera were used 
in a Western blot (Figure 132B) and for FACS analysis. 

These experiments show that cp6766 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 133 

The following C.pneumoniae protein (PID 4376804) was expressed <SEQ ID 265; cp6804>:

1 MSNQLQPCIS LGCVSYINSF PLSLQLIKKN DIRCVLAPPA DLLNLLIEGK

51 LDVALTSSLG AISHNLGYVP GFGIAANQRI LSVNLYAAPT FFNSPQPRIA

101 ATLESRSSIG LLKVLCRHLW RIPTPHILRF ITTKVLRQTP ENYDGLLLIG

151 DAALQHPVLP GFVTYDLASG WYDLTKLPFV FALLLHSTSW KEHPLPNLAM

201 EEALQQFESS PEEVLKEAHQ HTGLPPSLLQ EYYALCQYRL GEEHYESFEK

251 FREYYGTLYQ QARL*

The cp6804 nucleotide sequence <SEQ ID 266> is:

1 ATGTCTAACC AACTCCAGCC ATGTATAAGC TTAGGCTGCG TAAGTTATAT

51 TAATTCCTTT CCGCTGTCCC TACAACTCAT AAAAAGAAAC GATATTCGCT 

101 GTGTTCTTGC TCCCCCTGCA GACCTCCTCA ACTTGCTAAT CGAAGGGAAA 

151 CTCGATGTTG CTTTGACCTC ATCCCTAGGA GCTATCTCTC ATAACTTGGG

201 GTATGTCCCC GGCTTTGGAA TTGCAGCAAA CCAACGTATC CTCAGTGTAA

251 ACCTCTATGC AGCTCCCACT TTCTTTAACT CACCGCAACC TCGGATTGCC 

301 GCAACTTTAG AAAGTCGCTC CTCTATAGGA CTCTTAAAAG TGCTTTGTCG 

351 TCATCTCTGG CGCATCCCAA CTCCTCATAT CCTAAGATTC ATAACTACAA 

401 AAGTACTCAG ACAAACCCCT GAAAATTATG ATGGCCTCCT CCTAATCGGA 

451 GATGCAGCGC TACAACATCC TGTACTTCCT GGATTTGTAA CCTATGACCT

501 TGCCTCGGGG TGGTATGATC TTACAAAGCT ACCTTTTGTA TTTGCTCTTC 

551 TTCTACACAG CACCTCTTGG AAAGAACATC CCCTACCCAA CCTTGCGATG 

601 GAAGAAGCCC TCCAACAGTT CGAATCTTCA CCCGAAGAAG TCCTTAAAGA 

651 AGCTCATCAA CATACAGGTC TGCCCCCTTC TCTTCTTCAA GAATACTATG

701 CCCTATGCCA GTACCGTCTA GGAGAAGAAC ACTACGAAAG CTTTGAAAAA

751 TTCCGGGAAT ATTATGGAAC CCTCTACCAA CAAGCCCGAC TGTAA 

The PSORT algorithm predicts inner membrane (0.060). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 133A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
133B) and for FACS analysis.

These experiments show that cp6804 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 134 

The following C.pneumoniae protein (PID 4376805) was expressed <SEQ ID 267; cp6805>:

1 MSSLLSCGRI EPTRVTCSLK TYLEDTSQNQ LSTRLVRASV IFLCALLIIL

51 VCVALSSLIP SIMALATSFT VMGLILFVMS LLGDVAIISY LTYSTVTSYR

101 QNKRAFEIHK PARSVYYEGV RHWDLGRSSL GTGEIPIVRT LFSPFQNHGL

151 NHALAAKIFL FMEHFSPEPP NEPLVDVJACL IRDFRPHVSS LCFVIEKQGS

201 SLRTKEGNTI CEAFRSDYDA HFAMVDCYRL IHSKLIIEKM GLKNIDIIPS

251 VHVREDYPSR PGEGYREGLL RMYGGKGAL*

The cp6805 nucleotide sequence <SEQ ID 268> is: 

1 ATGTCATCAC [...]
51 TAGCTTAAAG [...]
101 GTCTAGTTCG [...]
151 GTTTGTGTGG [...]
201 CTCTTTTACG [...]
251 ACGTTGCAAT [...]
301 CAAAATAAGA [...]
351 CGAGGGGGTC [...]
401 AGATTCCTAT [...]
451 AACCATGCCT [...]
501 TGAGCCACCG [...]
551 TTAGGCCTCA [...]
601 TCGCTGAGGA [...]
651 TTACGACGCC [...]
701 AGTTGATTAT [...]
751 GTCATGGTTC [...]
801 AGGCCTATTA [...]



The PSORT algorithm predicts inner membrane (0.711). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 134A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
134B) and for FACS analysis.

These experiments show that cp6805 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 135 



The following C.pneumoniae protein (PID 4376813) was expressed <SEQ ID 269; cp6813>:

1 MSGPSRTESS QVSVLSYVPR DKEIAPKKQF TIAKISTLAI LASLALGALV
T ^^^ Vt ° NPVFLALLIT TALFSWTFL VYHQMTSKVS SNWQKVLEQN 
1 FKPLGKAWQE KNVDCYSNEM QFYNNHLNPK FKVAIQTDAS QPFOPTFLTC 
LRVIEKNQST GIIFNPVGPT NLIDNTATNIi STILYSTLKD KSVWDTCKOR 
PSPTEVRWK LPNEALDQTF NLNLSSAEKK SIMTfS 
CGPKSEELEW QQEYYRQALL AYENCLKAAI ESHAAXVALP LFTSVYEVPP 
EEILPKEGTF YWDNn-pnawr. rojT.r.m™ rr 



51 
101 
151 

30 201 

251 

" o.at,iN ) _jj I v«jij. r.£>irwiJ.VAJjP IiFTSVYEVPP 

351 SsSSeE* VTO, ' BrQ * PC K^AIQN TALRYPQRSL LVILQDPFNT 

The cp6813 nucleotide sequence <SEQ ID 270> is: 

35 * ATGTCAGGAC CCTCACGTAC TGAGAGCTCT CAAGTTTCTG TACTATCCTA 

. GATAAAGAAA TTGCTCCTAA AAAACAGTTT ACCATAGCAA 

AAATATCCAC TCTTGCAATC CTAGCTTCTT TAGCTTTAGG AGCTTTGGTG 
^ ^ GAATCT CTTTAACGAT AGTATTAGGG AACCCTGTAT TTTTGGCTCT 
4f ) TCTC ™CC ACGGCCCTCT TCTCAGTTGT AACCTTCTTA GTCTACcIcc 

40 "I AAATGACCTC AAAGGTATCT TCTAACTGGC AGAAAGTTCT AGAGCAAAAC 

301 TTCAAGCCTT TGGGAAAAGC GTGGCAAGAA AAAAACGTAG ACTGCTACTC 
2oi ™^ GATG CAATTTT ACA ATAATCACCT GAACCCTAAG TTCAAGGTAG 
401 CGATACAAAC AGATGCGTCT CAACCATTTC AGCCTACTTT CTTAACTgSa 
45 501 IZcccT^ TCGAAAAAAA TCAATCCACA GGGATCATCT 

501 AGGCCCAACG AATCTGATCG ACAACACTGC AACGAACCTC TCTACTATCC 
fiOl o^° CAC CCTAAAAGA T AAAAGCGTGT GGGATACATG CAAGCAACGC 
«1 ^^ GGT ° CCGCAAAAG G AGAAGACCCC TTTTCCCCTA CCGAAGTGAG 
701 ttlnnl^ CTTCCAAACG AAGCTCTAGA TCAAACGTTT AATCTAAATT 
50 753 AGAAAAGAAA AGTATTCTTC CGACCTTTTT AGGCCACGTA 

If} ^ CGGCCCTA AATCTGAAGA GTTACCA&AT CAGCAAGAAT ATTATCGCCA 
801 AGOTTTACTA GCGTACGAGA ACTGCCTTAA AGCAGCTATA GAAAGTCATG 
TGCTCT TCCT CTCTTTACTT CGGTCTATGA AGTOCCTCcI 
GAAGAGATTC TTCCTAAAGA AGGCACTTTC TATTGGGACA ACCAAACTCA 

1001 GCTATCCTCA SOS?*"* TATTGGACGC TATTCAAAAT ic2£S£ 
? , GG T_ CCTCA AAGATCTTTA CTTGTTATAC TCCAAGATCC TTTTAATACT 
1051 ATAGAATCAC AAAGTCGTTC TGAGGAGTAA JTAATACT 



851 
901 
951 

1001

The PSORT algorithm predicts inner membrane (0.4291).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 135A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
135B) and for FACS analysis. 

These experiments show that cp6813 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 136 

The following C.pneumoniae protein (PID 4376844) was expressed <SEQ ID 271; cp6844>:

1 MWRVVLRFLI IFILGRAVFP LRASESFSWE TSTCLTVLGI PFIDIILTTN

51 EDFVAQCGLQ IGTISSTNNA KIKEIPLIYK EKFPEASISF KRKEPLNLSQ

101 SHLSDLGILC MRNGETYAEG MANKENGPAL KQPKDLRLVL RCPNQPDTLL

151 YSEKEAEKGI ETNTCLCNQG YTLLDGQLIL YGDSIEKFLK ETKRKNNHTL

201 VDLCDSQVVT TFLGRFWSLL NYVQVLFLSE DSAKILAGIP DLAQATQLLS

251 HTVPLLFIYT NDSIHIIEQG KESSFTYNQD LTEPILGFLF GYINRGSMEY

301 CFNCAQSSLG ET*

The cp6844 nucleotide sequence <SEQ ID 272> is: 

1 ATGTGGCGCG TTGTCCTCAG ATTCCTTATA ATTTTTATCT TGGGAAGAGC 

51 CGTCTTCCCT CTAAGAGCTT CAGAAAGCTT CTCCTGGGAA ACATCGACCT

101 GTTTAACAGT GCTAGGGATT CCTTTCATAG ATATTATCCT CACAACGAAT 

151 GAGGACTTTG TTGCCCAGTG CGGCCTGCAA ATAGGAACCA TTTCTTCGAC

201 TAATAACGCA AAAATAAAAG AAATTTTTTT GATATATAAG GAAAAATTTC

251 CAGAAGCCTC TATCAGTTTC AAACGAAAAG AACCTCTAAA CCTTTCCCAA 

301 TCCCATCTCT CCGATTTAGG TATTTTATGT ATGCGTAACG GAGAAACTTA 

351 CGCTGAGGGA ATGGCAAATA AAGAAAACGG ACCCGCTCTA AAACAACCCA 

401 AGGATCTAAG ATTAGTTTTA CGTTGTCCTA ACCAACCAGA TACCCTGCTC

451 TACTCGGAAA AAGAAGCAGA AAAGGGCATA GAAACAAATA CTTGCCTATG 

501 CAATCAGGGA TACACACTCC TGGATGGGCA ATTGATTCTC TACGGGGATA 

551 GTATAGAAAA GTTTCTGAAA GAGACCAAAA GAAAGAATAA CCACACGCTT 

601 GTTGATCTTT GTGACTCACA AGTCGTGACC ACGTTCCTCG GTCGCTTTTG

651 GTCTCTTCTA AACTACGTTC AAGTTCTTTT CCTATCTGAA GACTCCGCTA

701 AAATTCTTGC GGGCATCCCA GACCTAGCTC AAGCTACGCA ATTGCTTTCC 

751 CACACCGTAC CTTTGCTTTT TATTTATACC AACGATTCTA TTCACATCAT 

801 AGAACAAGGC AAAGAAAGTA GTTTTACCTA TAACCAAGAT TTAACAGAGC 

851 CCATTTTAGG ATTTCTCTTT GGTTACATAA ATCGCGGCTC TATGGAATAC 

901 TGCTTTAATT GTGCACAGTC TTCATTAGGA GAAACCTAA

The PSORT algorithm predicts inner membrane (0.1786). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 136A) and also in 
his-tagged form. The recombinant proteins were used to immunise mice, whose sera were used in a 
Western blot (Figure 136B) and for FACS analysis. 

These experiments show that cp6844 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 
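
Each example reports a PSORT localisation prediction in the same sentence pattern. As a minimal sketch for tabulating those predictions, the snippet below extracts the predicted compartment and its score from such a sentence. The function name `parse_psort` and the regular expression are illustrative assumptions; they are not part of PSORT or of the described method.

```python
import re

# Hypothetical helper: matches sentences worded like the PSORT lines in
# these examples, e.g. "The PSORT algorithm predicts inner membrane (0.1786)."
PSORT_LINE = re.compile(r"PSORT algorithm predicts ([A-Za-z ]+?) \((\d*\.\d+)\)")


def parse_psort(line):
    """Return (localisation, score), or None if the line is not a PSORT sentence."""
    match = PSORT_LINE.search(line)
    if match is None:
        return None
    return match.group(1), float(match.group(2))


print(parse_psort("The PSORT algorithm predicts inner membrane (0.1786)."))
```

Applied across the examples, this makes it easy to see that proteins predicted as inner membrane or cytoplasm were nevertheless found to be surface-exposed by FACS, which is the point made after each example.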

Example 137 

The following C.pneumoniae protein (PID 4377201) was expressed <SEQ ID 273; cp7201>:

1 VLVGICPSLY PEHPRSFYYK VSGDIGSRFD DRGFVNSGVE TLPYSSGSFG 

51 IFWISFTDPT FNFAIVNTFM RTAGINEVSR PMTQDTETSL IEMRDLSEQQ

101 EANNTDSLEQ EESLMGIVGH TVGGVSMTVT SSPNIFYRIQ TLLGLPETLA 

151 EAEENPTFPN STIDSLAEIM MNLVRISDAV SIFWIFPIVD TTYNGVLLAV

201 CIGFFGINGI CSTFLMLTNP RSRRDRWRNL RIMVLCYRSL GSGMNLFDLS

251 NNVRMAARRH VTSCTVALYA MVTLFGWTVA IQDALQYGFP SVRPAFYRYC

301 LRHRYCLTQR MEDSLQTTGT RFQVTRTHLE DQQMVASILN LSVFGLFFGF

351 VGLMTTFGGL EISPSCRWDA ANNRTVGIF*

The cp7201 nucleotide sequence <SEQ ID 274> is:

1 GTGCTCGTTG GTATCTGTCC TTCTCTATAT CCAGAACATC CTCGCTCCTT 

51 TTATTATCGT GTTTCTGGAG ATATAGGCTC CCGATTCGAC GATAGAGGAT 

101 TTGTAAACTC TGGAGTCGAA ACCCTGCCAT ACTCTTCAGG CAGCTTTGGG

151 ATTTTTTGGA TCTCGTTTAC GGATCCCACA TTTAATTTTG CTATCGTAAA

201 TACCTTTATG CGAACTGCAG GGATCAATGA AGTCTCTAGA CCCATGACAC

251 AAGATACAGA AACTTCATTG ATAGAAATGA GAGACCTAAG TGAACAACAA 

301 GAAGCGAATA ACACAGATTC TTTAGAGCAA GAAGAGAGCT TAATGGGTAT

351 TGTAGGACAT ACTGTGGGAG GAGTTTCCAT GACCGTGACC TCCAGTCCAA

401 ATATCTTTTA TCGTATACAA ACACTTCTGG GACTGCCAGA GACTCTTGCA

451 GAAGCTGAAG AAAATCCTAC CTTCCCAAAT TCTACTATAG ATAGCCTTGC

501 AGAAATAATG ATGAACCTCG TAAGGATCTC TGATGCTGTC TCTATTTTCT 

551 GGATTTTTCC TATCGTAGAT ACTACATATA ATGGAGTTTT ATTAGCCGTC 

601 TGTATCGGCT TCTTCGGAAT CAATGGGATT TGTTCCACGT TCCTTATGCT 

651 TACGAATCCA CGCTCTCGTC GAGATAGATG GAGGAATTTA CGCATCATGG

701 TTCTTTGCTA TCGTTCTTTG GGAAGCGGAA TGAATCTCTT TGATCTTAGC

751 AATAATGTGC GCATGGCAGC ACGTAGGCAT GTGACATCAT GTACAGTAGC

801 TCTCTATGCT ATGG1CACTC TATTTGGATG GACAGTAGCA ATACAAGATG 

851 CTTTGCAATA TGGTTTCCCT AGCGTTCGGG ATGCCTTCTA TAGATATTGC 

901 TTACGCCACA GATATTGCTT AACTCAAAGA AACGAAGACT CTCTGCAAAC

951 TACAGGAACG CGCTTTCAGG TTACCCGTAC ACATCTAGAA GATCAACAGA

1001 TGGTGGCTTC TATTTTGAAT TTGAGTGTTT TTGGGCTCTT TTTTGGATTC 

1051 GTAGGGCTAA TGACCACGTT TGGAGGATTA GAAATCTCAC CATCTTGTCG 

1101 GTGGGATGCA GCAAATAACC GAACGGTAGG TATTTTTTAG 






The PSORT algorithm predicts inner membrane (0.3102). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 137A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
137B) and for FACS analysis. 

These experiments show that cp7201 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 138

The following C.pneumoniae protein (PID 4377251) was expressed <SEQ ID 275; cp7251>:

1 MAPIHGSNAF VEDILHSHPS PQATYFSSTR AQKLHEFKDR HPVLTRIASV 

51 IIKIFKVLIG LIILPLGIYW LCQTLCTNSI LPSKNLLKIF KKQPNTKTLK

101 TNYLHALQDY SSKNRVASMR RVPILQDNVL IDTLEICLSQ APTNRWMLIS

151 LGSDCSLEEI ACKEIFDSWQ RFAKLIGANI LVYNYPGVMS STGSSSLKDL

201 ASAHNICTRY LKDKEQGPGA KEIITYGYSL GGLIQAEALR DQKIVANDDT 

251 TWIAVKDRCP LFISPEGFHS CRRIGKLVAR LFGWGTKAVE RSQDLPCLEI 

301 FLYPTDSLRR STVRQNKLLA PELTLAHAIK NSPYVQNKEF IEVRLSSDID

351 PIDSKTRVAL ATPILKKLS*

The cp7251 nucleotide sequence <SEQ ID 276> is:

1 ATGGCTCCAA TTCACGGAAG TAATGCGTTT GTTGAGGATA TTTTACATTC
51 CCACCCTTCT CCACAAGCGA CTTATTTTTC TTCAACACGC GCCCAAAAAC
101 TTCATGAGTT TAAAGACAGG CATCCCGTGC TTACACGGAT TGCTTCTGTA
151 ATTATTAAAA TTTTTAAAGT TCTGATAGGG CTGATCATCC TTCCCTTAGG
201 AATCTACTGG CTATGTCAAA CGCTTTGTAC AAACTCGATT CTCCCTTCCA
251 AGAATTTATT AAAAATTTTC AAGAAGCAAC CCAACACTAA AACCTTAAAA
301 ACTAATTATT TGCATGCTTT GCAAGATTAT TCCTCGAAAA ACCGCGTTGC
351 TTCCATGAGA CGAGTTCCTA TCCTCCAGGA TAATGTTCTC ATCGACACTT
401 TGGAAATATG CCTTTCACAA GCACCTACGA ATCGTTGGAT GCTCATTTCT
451 TTAGGAAGTG ACTGTAGCTT GGAAGAAATC GCTTGTAAGG AGATCTTTGA

501 TTCTTGGCAA AGATTTGCCA AGTTGATAGG GGCCAATATA CTCGTTTATA

551 ACTACCCCGG AGTCATGTCC AGCACAGGGA GCAGCAGCCT AAAGGACCTA 

601 GCATCAGCTC ATAATATTTG TACAAGATAC CTTAAAGATA AAGAACAGGG 

651 CCCTGGAGCA AAAGAAATCA TTACCTATGG GTACTCCCTA GGAGGTTTGA 

701 TACAAGCAGA AGCATTGCGA GACCAGAAGA TTGTTGCAAA CGATGATACT 

751 ACTTGGATAG CAGTCAAAGA TAGGTGTCCT CTCTTTATAT CTCCAGAAGG 

801 TTTCCACAGT TGCAGACGCA TAGGAAAGCT AGTAGCTCGT CTTTTTGGCT

851 GGGGGACCAA AGCCGTAGAG AGAAGCCAAG ACCTTCCCTG CCTAGAAATT 

901 TTTCTCTATC CTACGGATTC CTTACGAAGA TCAACAGTCA GACAGAACAA

951 GCTCTTAGCA CCTGAACTTA CTCTCGCTCA TGCGATAAAA AATAGTCCCT

1001 ATGTTCAAAA TAAAGAATTT ATAGAAGTAC GATTATCGTC TGATATCGAT 

1051 CCCATCGACA GCAAAACAAG AGTGGCTCTT GCCACACCAA TTTTGAAAAA 

1101 GCTCTCTTAG 

The PSORT algorithm predicts inner membrane (0.4545). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 138A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
138B) and for FACS analysis. 

These experiments show that cp7251 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 139

The following C.pneumoniae protein (PID 4377288) was expressed <SEQ ID 277; cp7288>:

1 MHMSNPISLF SPAELIAKYN LIPKTSPIYP RRTELIILEE NACQTRLTNV 

51 AQVLHPSSLP SMSKKILNPC GCSGGPLCWV ILNILAFIIT SVLFIILLPV

101 KLIVAGLRLF MPLPPKKIVE DLSEPTTEET NEVIQPFIFA LQALLFEDNK

151 LRSFKIVEQS VGKAPLPNPF LNRLVAISPQ ESQEAMRKIP DLCSQLKKVL

201 KSLGVLTPEW KHMLKYFEGL KNEHDSNPDK KTFPILIKLL IEALTGKSSL
251 PKTPSTKEKM QAALFIASSC KTCKPTWGEV ITRSLNRLYS IANEGDNQLL
301 IWVQEFKERE LMSIQDGDDA EEYRFAAQQH GERYTEAIEQ VLRNESAAKL

351 QWHVINTMKF FHGKNLGLVT EHLQDTLGAL TLRQTTVDTH QGREDADLSA

401 ALFLNKYLNS GNQLVNSVFK SMQKADFETK ALIREFALDI LYASLRLPQT

451 SAHTEVFSTL LMDPETYEPN KACIAYLLYV LKIIEL*

The cp7288 nucleotide sequence <SEQ ID 278> is: 

1 ATGCATATGT CTAACCCCAT CTCTTTGTTT TCCCCTGCAG AGTTAATAGC 

51 AAAGTACAAT TTAATTCCAA AAACTTCGCC GATTTATCCT CGGAGGACGG 

101 AACTTATTAT CTTGGAAGAA AATGCGTGTC AAACACGCCT AACCAACGTG 

151 GCTCAGGTCC TACATCCTTC TAGCCTATTC AGTATGTCAA AAAAAATACT 

201 GAATCCCTGC GGGTGCTCTG GTGGTCCCTT ATCTTGGGTG ATTCTCAACA

251 TCCTAGCATT TATTATTACT TCAGTACTGT TTATCATTCT TTTACCGGTG 

301 AATCTCATCG TAGCAGGTCT TCGTCTCTTC ATGCCTCTTC CCCCTAAAAA
351 AATCGTAGAG GATTTAAGTG AACCTACTAC TGAAGAAACG AATGAGGTCA

401 TTCAACCCTT CATTTTCGCT TTGCAAGCGT TGCTTTTTGA GGATAACAAA 

451 CTTCGCTCTT TTAAAATTGT TGAACAAAGT GTAGGCAAAG CACCCTTACC 

501 TAATCCCTTT TTAAATAGAC TAGTAGCAAT TTCGCCGCAA GAAAGCCAAG 

551 AAGCCATGCG GAAGATTCCG GATCTATGCT CACAACTGAA AAAAGTATTA

601 AAGTCTCTAG GCGTGCTAAC TCCAGAATGG AAGCACATGC TGAAGTACTT

651 TGAGGGACTG AAAAACGAAC ATGATAGTAA TCCTGATAAA AAGACGTTCC 

701 CAATATTGAT CAAGCTCCTC ATAGAAGCTC TTACTGGAAA GTCCTCTTTA 

751 CCCAAAACTC CTAGTACAAA GGAAAAAATG CAAGCGGCCT TATTTATTGC 

801 AAGTTCTTGC AAGACTTGTA AGCCGACTTG GGGAGAAGTC ATAACCAGAT

851 CTCTTAACAG ACTCTATAGT ATAGCTAATG AAGGAGACAA TCAGCTTCTG

901 ATTTGGGTTC AAGAGTTTAA AGAACGAGAG CTGATGTCCA TCCAAGATGG 

951 TGATGATGCT GAAGAGTATC GGTTTGCGGC TCAGCAACAC GGTGAGCGTT 

1001 ACACAGAGGC AATAGAACAA GTTCTACGAA ACGAGTCAGC AGCCAAACTA 

1051 CAATGGCATG TGATCAACAC TATGAAATTC TTCCATGGGA AAAATCTCGG 

1101 TCTAGTTACA GAACACCTAC AAGATACTCT CGGCGCCCTA ACTTTACGTC

1151 AAACTACAGT GGACACACAT CAAGGCAGAG AAGACGCTGA TTTGTCAGCT 

1201 GCTCTTTTCC TAAATAAGTA TTTAAATTCT GGAAATCAAC TTGTTAATAG

1251 CGTCTTTAAA TCCATGCAAA AAGCAGATCC AGAAACCAAA GCTO?TAATCC

1301 GTGAGTTTGC TCTAGATATA TTATATGCAT CCTTACGGCT TCCTCAAACT
1351 TCCGCTCATA CCGAGGTCTT TTCTACACTC TTAATGGACC CAGAGACCTA
1401 TGAACCTAAT AAAGCTTGTA TCGCCTACTT GCTCTATGTA TTAAAGATCA
1451 TCGAACTATA A

The PSORT algorithm predicts inner membrane (0.5989). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 139A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
139B) and for FACS analysis. 

These experiments show that cp7288 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 140 

The following C.pneumoniae protein (PID 4377359) was expressed <SEQ ID 279; cp7359>:

1 MPGSVSSPPL SPVIVRERVP SSSGSDLIQP HAVLKISILI FALVTILGIV

51 LVVLSSALGA LPSLVLTVSG CIAIAVGLIG LGILVTRLIL STIRKVDAMG

101 YDAAVKEEQY LSRIRELESE NREIRDRNRA VEDQCAHLSE ENKDLRDPEY

151 LHGMTERLIA SLEIENQALV AENILLKDWN ASLSRDFRAY KQKFPLGALE

201 PWKEDIACIM EQNLFLKPEC IAMVKSLPLE TQRLFLYPKG FQSLVNRFAP

251 RSRFFQTPKY EYNSRNENED GKVAAVCARL KKEFFSAVLG ACSYEELGGI

301 CERAVALKET LPLPEAVYDT LVQEFPNLLT AESLWKEWCF YSYPYLRPYL

351 SVDYCKRLFV QLFEELCLKL FTTGSPEDQA LVRLFSYYRN HIPAVLASFG

401 LPPPETGGSV FVLLPKQENL LWSQIEVLAT RYLKDTFVRN SEWTGSFEMM

451 FSYNEMCKEI SEGRIRFAED YETRHSEEFP PSPLSEEGEG EEFLPPCSEE

501 EVSVLERPDL DVDSMWVWHP PVPKGPL*

The cp7359 nucleotide sequence <SEQ ID 280> is:

1 ATGCCAGGTT CTGTGTCATC ACCTCCTTTG TCTCCTGTAA TTGTCCGTGA 

51 AAGGGTCCCA TCCTCTTCAG GATCCGACCT CATACAGCCT CATGCTGTTT 

101 TAAAGATCTC CATCCTAATT TTTGCGCTTG TGACAATTTT AGGAATTGTT 

151 CTTGTAGTGT TGTCTAGTGC TTTAGGAGCT CTTCCTAGTT TAGTTTTGAC

201 GGTTTCTGGT TGTATTGCAA TAGCTGTAGG CCTGATTGGT TTAGGGATTC

251 TTGTGACACG GCTGATTCTC TCTACGATCA GAAAAGTAGA TGCCATGGGT 

301 TATGATGCTG CGGTCAAAGA AGAGCAGTAT TTGTCACGTA TCAGAGAATT

351 AGAGTCTGAA AATAGAGAGA TTAGAGATAG AAATCGTGCT GTCGAAGATC 

401 AGTGTGCCCA TTTATCCGAA GAGAACAAGG ACCTTAGGGA TCCCGAATAT

451 CTACATGGAA TGACTGAAAG GCTCATTGCG AGCTTAGAAA TAGAGAATCA

501 AGCTCTCGTA GCTGAGAACA TTCTTCTCAA AGACTGGAAT GCAAGCCTAT

551 CTAGAGATTT CCGCGCATAT AAGCAAAAAT TTCCTCTTGG GGCATTAGAA 

601 CCCTGGAAAG AAGATATTGC ATGTATCATG GAACAAAATC TCTTTTTAAA 

651 ACCGGAATGT ATCGCGATGG TTAAGTCTCT TCCATTAGAG ACGCAACGGC

701 TGTTTTTATA TCCAAAAGGA TTTCAGTCTT TAGTTAATCG ATTTGCTCCG

751 CGGTCTCGCT TTTTCCAGAC TCCAAAGTAT GAATATAACA GTAGGAATGA 

801 AAATGAGGAC GGAAAGGTAG CCGCAGTGTG CGCCCGTTTG AAAAAAGAAT 

851 TCTTCAGTGC TGTTTTAGGA GCCTGTAGTT ACGAAGAACT AGGGGGCATT 

901 TGTGAAAGAG CAGTAGCACT TAAAGAGACG TTGCCATTOC CTGAAGCTGT 

951 CTATGATACC CTAGTTCAGG AGTTCCCAAA TCTTCTTACT GCTGAGAGTT

1001 TATGGAAAGA ATGGTGCTTC TATTCCTATC CCTACCTTCG TCCCTATCTT 

1051 TCTGTGGATT ACTGTAAGAG GTTATTTGTA CAACTTTTTG AGGAACTCTG 

1101 CCTAAAGCTT TTTACAACGG GATCTCCAGA AGACCAAGCT TTGGTTCGCC 

1151 TTTTCTCTTA CTATAGGAAT CATATTCCCG CAGTCTTGGC CTCATTTGGT 

1201 TTGCCCCCGC CTGAGACAGG GGGGTCTGTA TTTGTATTGC TACCAAAACA

1251 AGAAAACCTT CTTTGGAGTC AAATTGAGGT GCTGGCTACA AGGTATCTCA 

1301 AAGATACCTT CGTGAGAAAC TCAGAATGGA CGGGCTCTTT CGAGATGATG 

1351 TTTTCTTATA ACGAGATGTG TAAGGAGATC TCCGAAGGAA GGATTCGTTT

1401 TGCTGAAGAC TATGAAACGA GGCATTCCGA AGAATTCCCT CCTTCCCCTC 

1451 TCTCTGAAGA AGGAGAGGGC GAAGAATTCC TTCCTCCTTG CTCTGAAGAA 

1501 GAGGTTTCGG TTCTTGAGCG CCCAGATCTA GATGTAGACT CTATGTGGGT

1551 CTGGCATCCG CCGGTCCCTA AGGGACCTCT TTAA

The PSORT algorithm predicts inner membrane (0.7453).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 140A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
140B) and for FACS analysis. 

These experiments show that cp7359 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 141 

The following C.pneumoniae protein (PID 4377374) was expressed <SEQ ID 281; cp7374>:

1 MDKQSSGNSG CIWHPFTQSA LDSTPIKIVR GEGAYLYAES GTRYLDAISS

51 WWCNLHGHGH PYITKKLCEQ AQKLEHVIFA NFTHEPALEL VSKLAPLDPE

101 GLERFFFSDN GSTSIEIAMK IAVQYYYNQW KAKSHFVGLS NAYHGDTFGA

151 MSIAGTSPTT VPFHDLFLPS STIAAPYYGK EELAIAQAKT VFSESNIAAF

201 IYEPLLQGAG GMLMYNPEGL KEILKLAKHY GVLCIADEIL TGFGRTGPLF

251 ASEFTDIPPD IICLSKGLTG GYLPLALTVT TKEIHDAFVS QDRMKALLHG

301 HTFTGNPLGC SAALASLDLT LSPECLQQRQ MIERCHQEFQ EAHGSLWQRC

351 EVLGTVLALD YPAEATGYFS QYRDHLNRFF LERGVLLRPD GNTLYVLPPY 

401 CIQEEDLRII YSHLQDALCL QPQ* 

The cp7374 nucleotide sequence <SEQ ID 282> is: 

1 ATGGACAAGC AATCATCAGG GAATTCAGGG TGTATCTGGC ACCCCTTCAC 

51 TCAATCTGCA TTAGATTCTA CACCCATAAA GATTGTAAGG GGAGAAGGTG

101 CTTACCTCTA TGCGGAATCA GGAACAAGAT ATCTTGATGC GATATCTTCA 

151 TGGTGGTGCA ACCTCCACGG TCATGGGCAT CCCTACATTA CAAAAAAATT 

201 ATGTGAGCAA GCACAGAAGT TAGAACATGT GATCTTCGCA AATTTCACCC

251 ATGAACCGGC TCTAGAGCTC GTATCGAAAC TCGCTCCCCT CCTTCCTGAA

301 GGTCTAGAAC GTTTCTTTTT CTCTGACAAC GGATCAACGT CTATCGAAAT

351 AGCAATGAAA ATTGCTGTGC AATATTACTA CAATCAAAAC AAGGCTAAGA

401 GCCATTTTGT TGGACTCAGC AATGCCTATC ACGGAGATAC ATTTGGAGCT 

451 ATGTCGATAG CTGGCACGAG CCCTACTACA GTTCCCTTTC ATGATCTTTT 

501 TCTTCCTTCC AGTACAATTG CTGCTCCCTA TTATGGCAAG GAAGAGCTTG

551 CCATTGCCCA AGCAAAAACA GTCTTTTCTG AAAGCAATAT CGCAGCGTTT 

601 ATCTATGAGC CGCTATTGCA AGGTGCTGGA GGGATGTTAA TGTATAATCC 

651 CGAAGGCCTA AAGGAGATTC TCAAGCTTGC CAAGCATTAC GGGGTTCTCT 

701 GTATTGCTGA TGAAATTCTT ACTGGCTTTG GCCGTACGGG TCCACTGTTT

751 GCTTCTGAAT TTACAGACAT TCCTCCTGAC ATTATCTGTC TTTCTAAAGG 

801 TCTTACAGGA GGCTATCTCC CTCTAGCCTT GACAGTAACC ACTAAAGAAA

851 TTCATGATGC CTTTGTCTCC CAAGATCGGA TGAAGGCACT GCTTCATGGC 

901 CATACCTTCA CAGGAAATCC TTTAGGCTGT AGTGCTGCCC TCGCTTCTTT

951 GGATCTCACC CTATCTCCAG AATGCCTACA ACAAAGGCAA ATGATAGAAC 

1001 GGTGTCATCA AGAGTTTCAA GAAGCTCATG GTTCCCTATG GCAACGGTGT 

1051 GAGGTTCTGG GCACGGTACT CGCTCTAGAT TACCCTGCAG AAGCTACAGG

1101 ATATTTTTCA CAATATAGAG ACCATCTCAA TCGCTTTTTC TTAGAACGTG 

1151 GAGTCCTTCT TCGTCCTTTA GGGAACACAC TGTATGTGCT GCCCCCCTAC 

1201 TGTATCCAAG AAGAAGATCT CCGGATTATT TATTCTCACC TACAGGATGC

1251 CCTATGTCTA CAACCACAGT AA

The PSORT algorithm predicts cytoplasm (0.2930).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 141A) and also in
his-tagged form. The recombinant proteins were used to immunise mice, whose sera were used in a 
Western blot (Figure 141B) and for FACS analysis. 

These experiments show that cp7374 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.

Example 142

The following C.pneumoniae protein (PID 4377377) was expressed <SEQ ID 283; cp7377>: 

^™* WSL EDI REIYHPP VFEIi IHKANA ILRSNFLHSE LQTCYLISIK 
TGGCVEDCAY CAQSSRYHTH VTPEPMMKIV DWERAKRAV EL^TOVcS 
AAWRNAKDDR YFDRVLAMVK SITDLGAEVC CALGMLSEEQ aSSS 

iZKSS ™™ ™**>*°4 



l 

51 
101 
151 

201 BEDRIKLLHV LATRDHIPES VPVNLLWPID GTPLQDQPPI SfSbvSE?! 

251 



3 01 DEDAEMIKLL GLIPRPSFGI ERGNPCYANN S* 

The cp7377 nucleotide sequence <SEQ ID 284> is:



l 

51 
101 
151 
201 
251 
301 
351 



J!?™ G AAACTGTATC CTGGTCATTA GAAGACATCC GCGAAATTTA 
TCACACTCCC GTATTTGAGC TGATTCACAA AGCCAATGCC ATATTGCG^a 
GTAOTCCT CCATTCAGAA CTGCAGACTT GCTATCTGAT 
ACTGGTGGAT GCGTTGAAGA TTGCGCCTAC TGTGCCCAAT CTTCCCGCTA 

tcatacccac gtcacaccag aacctatgat gaaaattgta gaSotgg 

AAAGGGCAAA ACGTGCTGTA GAGCTAGGCG CCACTCGTGT GTCTCtS 

gctgcctggc gcaatgctaa ggacgatcga tactotgata gagtcctcIc 

am ^ GGTGAAA AGTATCACAG ATCTCGGAGC CGAGGTTTGT TOTCCTTOAG 
20 £\ GCATGCTCTC CGAAGAGCAA GCTAAAAAAC TGTATGATGC AgScTOTAT 

GCCTACAATC ATAATTTAGA CTCTTCTCCG GAATTCTATG AAACTATAAT 
^S™? T TCTTATGAAG ATCGCCTCAA CACTC TTGAT GTAGTAAATA 
AATCTGGCAT TAGTACATGC TGCGGTGGTA TTGTAGGTAT GGGAGaItCT 
GAAGAAGACC GTATAAAGCT TCTTCATGTT CTTGCAACaI GAGATCATAT 
CCCAGAATCC GTACCTGTAA ATTTACTTTG GCCGA^GA^ G^ACGCC^ 
TGCAAGACCA GCCTCCGATT TCTTTCTGGG AAGTCTTGCG AACCATAGCA 
ACGGCACGGG TTGTTTTCCC CAGATCCATG GTACGACTTG C^CAGotcG 
CGCTTTCCTC ACAGTAGAAC AACAAACCTT ATGTTTTCTA gSS^CA 
QOI ^^ CATATT CTATG «AGAT AAACTGTTGA CTGTAGAAAA CAATOMATA 
30 H\ ^S AAGATG CTGAAAT <^ CAA&CTTTTA GGCTTAATCC CTC^CCTOC 

M 951 MTTGGAATA GAAAGAGGTA ACCCATGTTA TGCCAACAAT TCCtS 

The PSORT algorithm predicts cytoplasm (0.2926). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 142A) and also in
his-tagged form. The recombinant proteins were used to immunise mice, whose sera were used in a
Western blot (Figure 142B) and for FACS analysis. 

These experiments show that cp7377 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.

Example 143



The following C.pneumoniae protein (PID 4377407) was expressed <SEQ ID 285; cp7407>:
40 si ZpvTt^ mc ™cew evttteettr qsasdiseea gssggaapit 

ini" rvnQvafpoo^ KRVQFNTAQG DESTIHMIQE AGELVDSILS HRRTQGCTEY 

ill AVAtSS QRCGSFGRLI CGTYKACCLD REDNQVAGLV SSI 
201 x^ LVHKN TILSEEQKNE FRQHCSEAKT QLYgSsLS 

251 ™ ^^ LDDSL VQAVLSFIAT RSWEKTIESE EASGTSSASN 

ton STRIPACYIL NTSPLiTTSRL SCGSRDARRP SSVGAEPOYV AKrvT^w™* 

3S3K S522S SSSS = 1 
i2 SEE SSSK 3552 "™ 

The cp7407 nucleotide sequence <SEQ ID 286> is: 



J A ^^ TGCC CAA ATAATTC TTGGTTCAGA ATGTC'GGAA ATTTCAAr-TV 

11 c^taS gaagtaa ^ caacagaaga aacaacgcgg 

151 ACGCAAr^ CGAAGAAGCT GGTTCGAGTG GAGGAGCTGC TCCTmS 

151 ACGCAACCTA CTAAAATTAC AAAAGTAGAG AAACGTGTCC AATtWac

201 TGCTCAAGGT GATGAAAGTA CAATACACAT GATCCAAGAA GCAGGAGAAT
251 TGGTAGACTC CATTCTATCA CATAGACGAA CGCAAGGATG TACAGAGTAT

301 TGTTATGACA GTTACGCAAC TGGATGTGGT CAGCGTTGCG GATCTTTTGG
351 AAGACTCATT TGTGGAACGT ATAAAGCGTG TTGCTTAGAC AGAGAGGATA 
401 ATCAGGTTGC TGGACTTGTC CATGAATGCG AACAGACCCA TGGTCCTATT 
451 GCCGTTGCTT TAGCTGCTAA AACTATGGGC CTCAACTTAA TGGAACTTGT 
501 AGAAAAAAAC ACTATTTTGT CTGAAGAACA GAAAAATGAA TTTAGACAGC 
551 ATTGCTCGGA AGCTAAAACC CAACTCTATG GAACGATGCA GAGCCTTTCT 
601 CAAAACTTTT TCCTTGAAGG AGTCAACAGC ATTAGAGAAC GCGGTCTAGA 
651 CGATTCACTA GTCCAAGCCG TGCTAAGCTT TATTGCTACA AGGTCTTGGG 
701 AAAAAACTAT AGAATCAGAG GAAGCCTCAG GAACATCTTC TGCTTCTAAT 
751 TCTACACGCA TTCCTGCGTG CTATATCTTA AATACGAGCC CCTTAACGAC 
801 GTCACGCCTA TCCTGTGGAT CAAGAGATGC GCGACGCCCA TCTTCAGTCG 
851 GTGCAGAGCC CCAGTACGTA GCAAAAAAAT ACAATGACAA TGGCATGGCC 
901 AGACAATTAG GAAAAATCCA AGTCACCAAT CTAAAAACAG GAGATTTTTC 
951 AGCTTTAGGT CCTTTTGGTC TCCTGATTGT GAAAATGCTG AATAGCTTTC 

1001 TCTTATCTGC ATCACAAAGC ACATCTTCTA TTCTAAAGCA CACAGGTGGA

1051 GAAATATGTT ATACGTGCCC AAATTTTCGT GATATCGTCG TTTTATTGAT 

1101 GTTAGCGATT GGCTATTGCC CTGCAAATAC CGATGAGACA TCTGTCGTAG 

1151 ATATACACAT GATAGATGAT CCGATTATGA CCATCTTCTA TCGACTACAA 

1201 TACAGCTATA GAACAGGGAA AACTTCAGCA TCGTTTTTAA AAAAGAAACC 

1251 CTCATTAGTA AGACAGGAAA GTCTTGATTG TCCTACCCCT GCAGAATCTG 

1301 TCCCTCTCAT GTC.V 3TCTC GAAGAAGAAG ATGAAAATGA AGATGATGAT 

1351 GAGGATGGGA ATTTGGCGTA TCAACAGCGT ATCCTTGAAT GCTCGGGTCA 

1401 TTTACAAACT CTATTTTTAG GGATAAAAAT AAACAAAGAA TAA 

The PSORT algorithm predicts inner membrane (0.1319). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 143A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
143B) and for FACS analysis. 

These experiments show that cp7407 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone.

Example 144 

The following C.pneumoniae protein (PID 4376432) was expressed <SEQ ID 287; cp6432>: 

1 MTRSTIESSD SLCSRSFSQK LSVQTLKNLC ESRLMKITSL VIAFLTLIVG

51 GALIALAGGG VLSFPLGLIL GSVLVLFSSI YLVSCCKFFT LKEMTMTCSV 

101 KSKINIWFEK QRNKDIEKAL ENPDLFGENK RNVGNRSAKN QLEMILHETD 

151 GIILKRYMKG AKMYFYL*

The cp6432 nucleotide sequence <SEQ ID 288> is: 

1 ATGACTAGAA GTACTATTGA AAGCAGTGAT TCGCTATGCT CAAGGTCTTT 
51 TTCTCAAAAA TTAAGTGTCC AGACATTAAA AAATCTCTGT GAAAGTAGAT 
101 TAATGAAGAT CACTTCTCTT GTGATTGCTT TCCTAACTCT AATTGTGGGG 
151 GGTGCTCTTA TAGCTTTAGC AGGAGGGGGG GTTCTTTCTT TCCCTCTTGG 
201 GCTAATCTTA GGAAGCGTAC TCGTTTTGTT TTCTTCTATC TATTTAGTCT 
251 CTTGTTGTAA ATTTTTTACT TTAAAAGAGA TGACAATGAC CTGTAGTGTC 
301 AAATCTAAAA TCAATATATG GTTTGAAAAG CAACGAAACA AAGACATCGA 
351 AAAGGCATTA GAGAATCCAG ATCTCTTTGG AGAAAATAAG AGAAATGTTG
401 GAAATCGTTC GGCAAGAAAT CAACTAGAAA TGATCTTACA CGAGACTGAC 
451 GGAATTATTT TGAAAAGATA TATGAAAGGA GCTAAAATGT ACTTTTATTT 
501 ATGA 

The PSORT algorithm predicts inner membrane (0.5394). 

The protein was expressed in E.coli and purified as a his-tagged product (Figure 144A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
144B) and for FACS analysis.

These experiments show that cp6432 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 
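
Because each example lists both a protein and its coding sequence, one listing can be used to cross-check the other. The following is a minimal sketch, assuming the standard genetic code and that digits and spaces in a listing are only position indices and column gaps. The helper names `clean` and `translate` are hypothetical. Codons are translated literally, so an initial GTG reads as V, which is how the listings in this document show it.

```python
import re

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(listing):
    """Strip position indices and column spacing, keeping letters and '*'."""
    return re.sub(r"[^A-Za-z*]", "", listing).upper()


def translate(dna):
    """Translate complete codons only; a trailing partial codon is ignored."""
    whole = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, whole, 3))


# First line of the cp6432 nucleotide listing and the start of its protein listing.
nt_line = "1 ATGACTAGAA GTACTATTGA AAGCAGTGAT TCGCTATGCT CAAGGTCTTT"
aa_line = "1 MTRSTIESSD SLCSRSFSQK"

print(translate(clean(nt_line)))
print(clean(aa_line).startswith(translate(clean(nt_line))))
```

A mismatch between the translated coding sequence and the listed protein points to a transcription error in one of the two listings rather than to a biological difference.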

Example 145 

The following C.pneumoniae protein (PID 4376433) was expressed <SEQ ID 289; cp6433>:

1 MNWVPKTIDH VDPESEIDIR KVVSCYKLIK ECQPEFRSLI SELLGVIRCG

51 LRLLKRSKYQ EQARTVSDED APLFCLTRSY YQDGYLTPLR AGPRDLINHY 

101 IHLRRRENPK HFFSPKHPCY YARLAFNESV CVYRELFDIE RLTKMYVEGD 

151 YSKEQEKNLQ AILSFVKTLD EGKDFLIEHK DTDLIGRGFT DVFCT*

The cp6433 nucleotide sequence <SEQ ID 290> is: 

1 ATGAATTGGG TTCCAAAAAC AATAGACCAT GTAGATCCAG AATCAGAGAT 

51 AGATATACGT AAAGTCGTCT CCTGCTATAA GTTGATAAAA GAATGTCAAC 

101 CTGAATTTCG ATCTCTTATA AGTGAATTAC TAGGAGTGAT TCGGTGTGGC 

151 TTAAGACTAT TAAAACGTTC TAAGTATCAA GAACAGGCTA GAACTGTATC 

201 TGATGAAGAT GCACCTCTTT TCTGCCTGAC TCGTTCTTAT TATCAAGATG

251 GTTATCTCAC GCCATTAAGA GCAGGACCTC GTGATCTTAT AAATCACTAT

301 ATACACTTGC GTCGCCGAGA GAATCCTAAG CATTTTTTCA GTCCTAAGCA 

351 TCCATGTTAT TATGCTCGAT TGGCTTTTAA TGAGTCAGTG TGTGTCTATA 

401 GAGAACTCTT TGATATAGAG CGACTTACAA AAATGTATGT CGAGGGTGAT 

451 TATTCTAAAG AACAAGAGAA AAACCTACAG GCTATTCTTA GTTTTGTGAA

501 AACTCTAGAT GAAGGAAAGG ACTTTCTTAT TGAACATAAA GATACCGATC

551 TCATTGGGAG AGGTTTTACT GATGTGTTCT GCACTTAA 

The PSORT algorithm predicts cytoplasm (0.4068). 

The protein was expressed in E.coli and purified as a his-tagged product (Figure 145A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
145B) and for FACS analysis.

These experiments show that cp6433 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 146 



The following C.pneumoniae protein (PID 4376643) was expressed <SEQ ID 291; cp6643>: 

1 MGYLPVSATD VLFESPAAPL INSANTQNQK LIELKGKQQA ESSPRTITSV

51 ILEVLLVIGC CLIVLSLLAI RPALQFTLET GHPAAIAVLA VSGTILLVAV
101 IILFCFLAAV PFAAKKTYKY VKTVDDYASW HSHQQTPTLG TIFSGIVYAE 
151 SQAQL* 

The cp6643 nucleotide sequence <SEQ ID 292> is: 

1 ATGGGATATC TTCCAGTATC TGCTACGGAC GTTCTTTTTG AAAGTCCAGC

51 CGCTCCCTTA ATCAATAGCG CAAACACACA AAATCAGAAA CTCATAGAAC 

101 TCAAGGGGAA GCAGCAAGCT GAGTCTTCTC CACGGACAAT CACTTCTGTC 

151 ATATTGGAAG TTCTCCTAGT GATCGGATGC TGCCTCATAG TTCTTAGTTT 

201 ATTGGCAATC CGCCCTGCTC TGCAATTCAC TCTAGAAACT GGACATCCAG 

251 CTGCCATTGC AGTCCTTGCT GTCTCAGGAA CAATTCTATT GGTGGCTGTT

301 ATCATCTTGT TTTGCTTTCT AGCAGCTGTG CCATTCGCTG CTAAGAAAAC 

351 TTATAAATAT GTTAAGACGG TTGATGACTA TGCTTCTTGG CATTCTCATC

401 AGCAAACACC GACCCTAGGC ACTATCTTTT CAGGTATCGT CTATGCAGAA 

451 TCCCAGGCGC AATTATAG 

The PSORT algorithm predicts inner membrane (0.6859).

The protein was expressed in E.coli and purified as a his-tagged product (Figure 146A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
146B) and for FACS analysis. 

These experiments show that cp6643 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone.
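
The position index at the start of each listing line gives a quick consistency check on sequence lengths: a coding sequence should contain three nucleotides per residue plus one stop codon. The sketch below applies this to the final lines of the cp6643 listings. The function name `listing_length` is a hypothetical helper, and the check assumes the index is the 1-based position of the first residue on the line.

```python
import re


def listing_length(last_line):
    """Total length implied by a listing's final line: (start index - 1) + residues on that line."""
    index, residues = last_line.split(None, 1)
    count = len(re.sub(r"[^A-Za-z]", "", residues))  # the stop marker '*' is not counted
    return int(index) - 1 + count


protein_len = listing_length("151 SQAQL*")
dna_len = listing_length("451 TCCCAGGCGC AATTATAG")

print(protein_len, dna_len)
print(dna_len == 3 * (protein_len + 1))  # residues plus one stop codon
```

The same check can be run on any example whose final listing lines are intact; a failure suggests a dropped or duplicated block in one of the listings.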

Example 147 

The following C.pneumoniae protein (PID 4376722) was expressed <SEQ ID 293; cp6722>:

1 VSSTLNGVFP SSLPEESADL FITNKEIVAL GEKGNVFLTH SIPMHIAAIT 

51 ILVIVALAGI AIICLGCYSQ SILLIAVGIV LTILTLLCLQ ALVGFIKFIR

101 QLPQQLHTTV QFIREKIRPE SSLQLVTNAQ RKTTQDTLKL YEELCDLSQK

151 EFKLQSTLYQ KRFELSHKNE KTNQN* 

The cp6722 nucleotide sequence <SEQ ID 294> is: 

1 GTGTCTAGTA CTTTAAACGG GGTATTTCCC TCATCCCTTC CGGAAGAGTC 

51 TGCTGATTTA TTCATTACGA ATAAGGAGAT CGTAGCTTTG GGGGAGAAGG 

101 GCAATGTTTT TCTCACCCAC TCCATTCCTA TGCATATTGC TGCGATTACG

151 ATCTTAGTGA TTGTAGCTCT TGCTGGAATC GCTATTATCT GTTTGGGTTG 

201 CTATAGCCAA AGCATTCTGT TGATTGCCGT TGGCATTGTT CTTACTATTT 

251 TGACTCTTCT CTGCCTACAA GCCTTGGTAG GATTTATTAA ATTCATCCGG

301 CAGCTCCCTC AGCAGCTCCA TACGACAGTA CAATTTATCA GGGAGAAGAT
351 TCGACCTGAA TCCTCTCTAC AGCTTGTAAC CAATGCACAG AGAAAAACCA

401 CTCAAGATAC GCTAAAGTTA TACGAAGAAC TCTGCGACCT CTCACAAAAA
451 GAGTTCAAAC TGCAATCAAC TCTTTATCAA AAACGTTTTG AGCTTTCTCA

501 CAAGAATGAA AAGACAAATC AAAACTAG

The PSORT algorithm predicts inner membrane (0.6668). 

The protein was expressed in E.coli and purified as a his-tagged product (Figure 147A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
147B) and for FACS analysis. 

These experiments show that cp6722 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 148

The following C.pneumoniae protein (PID 4377253) was expressed <SEQ ID 295; cp7253>:

1 MSELAPCSTG LQMVPHTQVH HALDTRRVIL TIAACLSLIA GIVLVGLGAA 

51 AILPSLFGVI GGMILILFSS IALIYLYKKT REVDQIALEP LPEMISKDQS 

101 IIDFVKTRDY ASLEKKATFA YTHTHYYDGS MVFYREIPRF MLGSYLALRK

151 DMDRQALF*

The cp7253 nucleotide sequence <SEQ ID 296> is: 

1 ATGAGCGAGC TCGCCCCCTG CTCGACAGGA TTGCAGATGG TCCCCCATAC 

51 GCAGGTCCAT CATGCCCTTG ATACGCGGAG AGTCATTCTA ACGATAGCCG 

101 CCTGTCTGTC TTTAATTGCA GGAATCGTGT TGGTTGGCTT AGGTGCTGCA 

151 GCAATCCTGC CCTCGCTTTT TGGAGTCATT GGAGGAATGA TTCTTATTCT

201 GTTTTCTTCG ATCGCCCTCA TTTATTTATA CAAGAAGACA AGGGAGGTGG 

251 ATCAGATTGC TCTGGAGCCT CTTCCTGAGA TGATTTCTAA AGATCAAAGC 

301 ATTATAGATT TTGTAAAGAC ACGAGACTAT GCATCTTTAG AAAAGAAAGC

351 GACCTTTGCT TATACTCATA CTCATTATTA CGATGGAAGC ATGGTCTTCT 

401 ATAGGGAGAT CCCTAGATTT ATGTTAGGCT CTTATCTCGC GCTTCGCAAA

451 GACATGGACC GCCAAGCTCT TTTTTGA 

The PSORT algorithm predicts inner membrane (0.5394).

The protein was expressed in E.coli and purified as a his-tagged product (Figure 148A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
148B) and for FACS analysis. 

These experiments show that cp7253 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 149 

The following C.pneumoniae protein (PID 4376264) was expressed <SEQ ID 297; cp6264>:

1 VISGLLFLLV RREVPTVRSE EIPRGVSVTP SEEPALEKAQ KEPETKKILD 

51 RLPKELDQLD TYIQEVFACL ERLKDPKYED RGLLTEAKEK LRVFDVVEKD

101 MMSEFLDIQR VLNEEAYYVE HCQDPLENIA YEIFSSQELR DYYCAGVCGY

151 LPSGDARADR LKRSVKEVMD RFMRVTWKSW EASVMLDHSY GVARELFKKA 

201 VGVLEESVYK ILFKSYRDAF YECEKAKIQR DGRFKWL* 

The cp6264 nucleotide sequence <SEQ ID 298> is: 

1 GTGATTTCGG GACTTCTATT CCTTCTAGTA AGACGAGAGG TTCCGACAGT 

51 ACGTTCAGAG GAAATTCCCA GAGGGGTTTC TGTGACCCCT TCTGAAGAGC 

101 CTGCTCTAGA GAAGGCTCAA AAAGAACCGG AGACAAAGAA AATTTTAGAT 

151 CGGTTGCCGA AGGAATTGGA TCAGTTAGAT ACGTATATTC AGGAAGTGTT 

201 TGCATGTTTA GAGAGGCTGA AGGATCCTAA GTACGAAGAT CGAGGTCTTT

251 TAACAGAGGC GAAGGAGAAA CTTCGAGTTT TTGACGTTGT TGAGAAAGAT 

301 ATGATGTCAG AGTTTTTAGA CATACAACGA GTGTTGAATG AGGAAGCATA 

351 TTATGTAGAA CATTGTCAAG ATCCCCTAGA GAATATAGCC TACGAGATTT 

401 TCTCTTCCCA AGAGCTTCGT GATTACTACT GTGCAGGGGT GTGTGGGTAT 

451 TTGCCTTCTG GGGATGCTCG AGCGGATCGA TTAAAGAGAT CAGTTAAGGA 

501 GGTAATGGAT CGCTTTATGA GGGTGACCTG GAAATCTTGG GAGGCATCAG 

551 TCATGTTGGA TCATAGCTAT GGGGTAGCGC GAGAGTTATT CAAGAAGGCA 

601 GTAGGAGTAC TAGAGGAGAG TGTCTATAAA ATTCTGTTTA AGAGCTATAG

651 AGATGCGTTT TATGAATGTG AGAAGGCAAA GATCCAGAGG GATGGGCGTT 

701 TCAAATGGTT ATAG 

The PSORT algorithm predicts cytoplasm (0.2817). 

The protein was expressed in E.coli and purified as a his-tagged product (Figure 149A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
149B) and for FACS analysis. 

These experiments show that cp6264 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
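
The correspondence between a nucleotide listing and its protein listing can be checked by translating the nucleotide sequence with the standard genetic code. The following Python sketch is illustrative only and is not part of the original disclosure; the names `CODON_TABLE` and `translate` are hypothetical. It translates the first codon literally (GTG as V, TTG as L), which is how the protein listings in these Examples appear to be written.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA codon by codon; unrecognised codons become 'X'."""
    dna = "".join(dna.split()).upper()       # drop the spaces between blocks
    usable = len(dna) - len(dna) % 3         # ignore a trailing partial codon
    codons = (dna[i:i + 3] for i in range(0, usable, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# First 30 bases of the cp6264 nucleotide sequence (SEQ ID 298).
print(translate("GTGATTTCGG GACTTCTATT CCTTCTAGTA"))
```

A codon containing a character that is not a base is reported as `X`, which gives a simple way to spot transcription damage in a listing.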

Example 150 

The following C.pneumoniae protein (pid 4376266) was expressed <SEQ ID 299; cp6266>: 

1 MLLLISGALF LTLGIPGLSA AISFGLGIGL SALGGVLMIS GLLCLLVKRE 

51 IPTVRPEEIP EGVSLAPSEE PALQAAQKTL AQLPKELDQL DTDIQEVFAC 

101 LRKLKDSKYE SRSFLNDAKK ELRVFDFVVE DTLSEIFELR QIVAQEGWDL 

151 NFLINGGRSL MMTAESESLD LFHVSKRLGY LPSGDVRGEG LKKSAKEIVA 

201 RLMSLHCEIH KVAVAFDRNS YAMAEKAFAK ALGALEESVY RSLTQSYRDK 

251 FLESERAKIP WNGHITWLRD DAKSGCAEKK LGMPRNVGRN LGKQSFG* 

The cp6266 nucleotide sequence <SEQ ID 300> is: 

1 ATGCTCTTAC TGATTTCAGG AGCTCTCTTT CTGACGTTAG GGATTCCAGG 

51 ATTGAGTGCA GCAATTTCTT TTGGATTAGG CATCGGTCTC TCCGCATTAG 

101 GAGGAGTGCT GATGATTTCG GGACTACTAT GTCTTTTAGT AAAACGAGAG 

151 ATTCCGACAG TACGACCAGA AGAAATTCCT GAAGGGGTTT CGCTGGCTCC 

201 TTCTGAGGAG CCAGCTCTAC AGGCAGCTCA GAAGACTTTA GCTCAGCTGC 

251 CTAAGGAATT GGATCAGTTA GATACAGATA TTCAGGAAGT GTTCGCATGT 

301 TTAAGAAAGC TGAAAGATTC TAAGTATGAA AGTCGAAGTT TTTTAAACGA 

351 TGCTAAGAAG GAGCTTCGAG TTTTTGACTT TGTGGTTGAG GATACCCTCT 

401 CGGAGATTTT CGAGTTGCGG CAGATTGTGG CTCAAGAGGG ATGGGATTTA 

451 AACTTTTTGA TCAATGGGGG ACGAAGCCTC ATGATGACTG CAGAATCTGA 

501 ATCGCTTGAT TTGTTTCATG TATCGAAGCG GCTAGGGTAT TTACCTTCTG 

551 GGGATGTTCG AGGGGAGGGG TTAAAGAAAT CTGCGAAGGA GATAGTCGCT 

601 CGTTTGATGA GCTTGCATTG CGAGATTCAC AAGGTGGCGG TAGCGTTTGA 

651 TAGGAATTCC TATGCGATGG CAGAAAAGGC GTTTGCGAAA GCGTTGGGAG 

701 CTTTAGAAGA GAGTGTGTAT CGGAGTCTGA CGCAGAGTTA TAGAGATAAA 

751 TTTTTGGAGA GCGAGAGGGC GAAGATCCCA TGGAATGGGC ATATAACCTG 

801 GTTAAGAGAT GATGCGAAGA GTGGGTGTGC TGAAAAGAAG CTCGGGATGC 

851 CGAGGAACGT TGGAAGAAAT TTAGGAAAGC AGTCTTTTGG GTAG 

The PSORT algorithm predicts inner membrane (0.3590). 

The protein was expressed in E.coli and purified as a his-tag product (Figure 150A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
150B) and for FACS analysis. 

These experiments show that cp6266 is a surface-exposed and immunoaccessible protein and that 
it is a useful immunogen. These properties are not evident from the sequence alone. 

Example 151 

The following C.pneumoniae protein (pid 4376895) was expressed <SEQ ID 301; cp6895>: 

1 MKIKKSFQYS LCQAKRFQNM LPNHFDPCLQ PVNLQLKQDR LAYGELIILL 
51 SKYQQKTFSS LLKEETCSLN RAKQHLLYKI LRDFNTMQHL RSLGLNGWGE 
101 IPMSPCL* 

The cp6895 nucleotide sequence <SEQ ID 302> is: 

1 ATGAAGATTA AAAAATCTTT TCAATACAGT TTATGCCAAG CAAAGAGATT 

51 TCAGAACATG CTGCCAAACC ACTTTGATCC ATGTTTGCAG CCAGTGAATT 

101 TACAACTCAA ACAAGACAGA TTGGCATACG GGGAGCTCAT CATATTGCTA 

151 TCTAAATATC AACAAAAGAC CTTTTCCTCT TTGTTGAAGG AAGAAACATG 

201 TTCTCTTAAT CGTGCGAAGC AGCACTTATT GTATAAGATT TTGAGAGATT 

251 TTAATACTAT GCAGCATCTA AGGTCCCTCG GATTAAATGG TTGGGGAGAG 

301 ATCCCTATGA GTCCTTGCCT CTAA 

The PSORT algorithm predicts cytoplasm (0.3264). 

The protein was expressed in E.coli and purified as a his-tag product (Figure 151A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
151B) and for FACS analysis. 

These experiments show that cp6895 is a surface-exposed and immunoaccessible protein and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
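
Before a listing of this kind is analysed it is convenient to join its numbered lines into a single string and to check that only valid bases remain. The following Python sketch is one possible way of doing this and is offered as an assumption-laden illustration, not as part of the original disclosure; the helper names `join_listing` and `invalid_positions` are hypothetical.

```python
import re

DNA_ALPHABET = set("ACGT")


def join_listing(lines):
    """Join numbered listing lines into one upper-case sequence string."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Drop the leading position index (e.g. "251 ") and inner spaces.
        body = re.sub(r"^\d+\s+", "", line)
        parts.append("".join(body.split()))
    return "".join(parts).upper()


def invalid_positions(seq, alphabet=DNA_ALPHABET):
    """Return the 1-based positions of characters outside the alphabet."""
    return [i for i, ch in enumerate(seq, start=1) if ch not in alphabet]


# Last two lines of the cp6895 nucleotide listing (SEQ ID 302).
listing = [
    "251 TTAATACTAT GCAGCATCTA AGGTCCCTCG GATTAAATGG TTGGGGAGAG",
    "301 ATCCCTATGA GTCCTTGCCT CTAA",
]
sequence = join_listing(listing)
print(len(sequence), invalid_positions(sequence))
```

An empty list from `invalid_positions` indicates that every character is A, C, G or T; it does not show that the bases themselves are correct.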

Example 152 and 
Example 153 

The following C.pneumoniae protein (pid 4376282) was expressed <SEQ ID 303; cp6282>: 

1 MSLLNLPSSQ DSASEDSTSQ SQIFDPIRNR ELVSTPEEKV RQRLLSFLMH 

51 KLNYPKKLII IEKELKTLFP LLMRKGTLIP KRRPDILIIT PPTYTDAQGN 

101 THNLGDPKPL LLIECKALAV NQNALKQLLS YNYSIGATCI AMAGKHSQVS 

151 ALFKPKTQTL DFYPQLPEYS QLLNYFISLN L* 

The cp6282 nucleotide sequence <SEQ ID 304> is: 

1 ATGTCCTTAT TGAACCTTCC CTCAAGCCAG GATTCTGCAT CTGAGGACTC 

51 CACATCGCAA TCTCAAATCT TCGATCCCAT TAGAAATCGG GAGTTAGTTT 

101 CTACTCCCGA AGAAAAAGTC CGCCAAAGGT TGCTCTCCTT CCTAATGCAT 

151 AAGCTGAACT ACCCTAAGAA ACTCATCATC ATAGAAAAAG AACTCAAAAC 

201 TCTTTTTCCT CTGCTTATGC GTAAAGGAAC CCTAATCCCA AAACGCCGCC 

251 CAGATATTCT CATCATCACT CCCCCCACAT ACACAGACGC ACAGGGAAAC 

301 ACTCACAACC TAGGCGACCC AAAACCCCTG CTACTTATCG AATGTAAGGC 

351 CTTAGCCGTA AACCAAAATG CACTCAAACA ACTCCTTAGC TATAACTACT 

401 CTATCGGAGC CACCTGCATT GCTATGGCAG GGAAACACTC TCAAGTGTCA 

451 GCTCTCTTCA ATCCAAAAAC ACAAACTCTT GATTTTTATC CTGGCCTCCC 

501 AGAGTATTCC CAACTCCTAA ACTACTTTAT TTCTTTAAAC TTATAG 

The PSORT algorithm predicts cytoplasm (0.362). 

The following C.pneumoniae protein (pid 4377373) was also expressed <SEQ ID 305; cp7373>: 

1 MSTTTVKHFI HTASKWEPVL KEIVASNYWH AQWINTLSFL ENSGAKKISA 

51 SEHPTEVKEE VLKHAAEEFR HGHYLKTQIS RISETSLPDY TSKNLLGGLL 

101 TKYYLHLLDL RTCRVLENEY SLSGQTLKTA AYILVTYAIE LRASELYPLY 

151 HDILKEAQSK ITVKSIILEE QGHLQEMERE LKDLPHGEEL LGYACQFEGE 

201 LCLQFVERLE QMIFDPSSTF TKF* 

The cp7373 nucleotide sequence <SEQ ID 306> is: 

1 ATGTCTACAA CCACAGTAAA ACACTTTATC CACACAGCCT CTCGTTGGGA 

51 GCCCGTTCTC AAAGAGATCG TAGCTTCCAA CTATTGGCAT GCACAATGGA 

101 TAAATACCCT GTCCTTTTTA GAAAATAGTG GAGCAAAAAA AATCTCCGCA 

151 AGTGAACATC CTACGGAGGT AAAGGAAGAA GTTTTAAAAC ATGCTGCTGA 

201 AGAATTTCGT CATGGTCACT ATCTAAAAAC TCAGATTTCT AGAATCTCAG 
251 AGACTTCTCT CCCTGACTAT ACATCTAAAA ATCTTCTGGG AGGCTTACTT 

301 ACAAAATATT ACCTCCATCT TCTAGATTTA AGGACGTGCC GAGTACTGGA 
351 AAATGAATAC TCCCTATCGG GACAAACGTT AAAAACTGCA GCGTATATTT 
401 TAGTTACCTA CGCAATCGAA CTTCGTGCTT CTGAACTTTA TCCTCTGTAT 
451 CACGATATTC TGAAAGAAGC TCAAAGTAAA ATAACGGTAA AATCCATTAT 
501 CTTAGAAGAG CAAGGCCATC TGCAAGAGAT GGAACGTGAA CTTAAAGATC 
551 TCCCCCACGG GGAGGAACTC TTAGGCTATG CTTGCCAATT CGAAGGGGAG 
601 CTTTGCTTGC AGTTTGTAGA GAGATTAGAA CAAATGATCT TCGATCCTTC 
651 CTCGACTTTT ACAAAGTTCT AG 

The PSORT algorithm predicts cytoplasm (0.1069). 

The proteins were expressed in E.coli and purified as his-tag products (Figure 152A; 6282 = lanes 8 
& 9; 7373 = lanes 2-4). The recombinant proteins were used to immunise mice, whose sera were 
used in Western blots (Figures 152B & 153) and for FACS analysis. 

These experiments show that cp6282 & cp7373 are surface-exposed and immunoaccessible proteins 
and that they are useful immunogens. These properties are not evident from the sequence alone. 
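
Each Example reports a PSORT localisation prediction in a fixed sentence form, so the predicted compartment and its score can be collected programmatically and set beside the experimental finding of surface exposure. The following Python sketch shows one way to do this; the regular expression and the function name `parse_psort` are assumptions made for illustration and are not part of the PSORT program itself.

```python
import re

# Matches e.g. "The PSORT algorithm predicts cytoplasm (0.1069)."
PSORT_RE = re.compile(r"The PSORT algorithm predicts (.+?) \((\d+\.\d+)\)")


def parse_psort(line):
    """Return (location, score) from a PSORT sentence, or None if no match."""
    match = PSORT_RE.search(line)
    if match is None:
        return None
    location, score = match.groups()
    return location, float(score)


print(parse_psort("The PSORT algorithm predicts cytoplasm (0.1069)."))
```

Sentences whose score is not written as a decimal number return `None`, so damaged entries are skipped rather than misread.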

Example 154, 

Example 155, 

Example 156, 

Example 157 and 
Example 158 

The following C.pneumoniae protein (pid 4376412) was expressed <SEQ ID 307; cp6412>: 

1 MSSSEVVFQT VHGLGFGGLS SKSVVPFKKS LSDAPRVVCS ILVLTLGLGA 

51 LVCGIAITCW CVPGVILMGG ICAIVLGAIS LALSLFWLWG LFSNCCGSKR 

101 VLPGEGLLRD KLLDGGFSRA APSGMGLPGD GSPRASTPSC LEELQAEIQA 

151 VTQAIDQMSD D* 

The cp6412 nucleotide sequence <SEQ ID 308> is: 

1 ATGAGCAGTT CGGAAGTTGT TTTCCAGACA GTTCATGGCC TTGGCTTTGG 

51 TGGATTGTCT TCAAAAAGTG TTGTCCCTTT TAAGAAAAGT CTTTCGGATG 

101 CGCCCCGTGT TGTGTGCTCG ATTTTAGTTT TGACTCTGGG GTTGGGAGCG 

151 CTTGTTTGTG GTATTGCCAT TACTTGTTGG TGTGTCCCGG GAGTTATTTT 

201 AATGGGGGGA ATTTGCGCTA TAGTTTTAGG TGCAATTTCT TTAGCTTTAA 
251 GTCTATTTTG GTTGTGGGGT TTATTTTCTA ATTGTTGTGG TTCTAAGAGA 

301 GTTTTACCGG GTGAGGGATT GCTACGGGAT AAGCTTTTAG ATGGTGGATT 
351 TTCAAGAGCG GCACCTTCAG GAATGGGACT TCCGGGTGAT GGATCTCCAA 

401 GAGCGTCAAC GCCATCTTGC CTAGAGGAAC TTCAAGCAGA GATACAGGCA 

451 GTTACTCAAG CTATCGATCA GATGTCAGAT GATTGA 

The PSORT algorithm predicts inner membrane (0.4864). 

The following C.pneumoniae protein (pid 4376431) was also expressed <SEQ ID 309; cp6431>: 

1 LRAGGSLVTT YPKEGQRLRS PEQLRVLDDL VQSYPNHLHA IELDCGAIPQ 
51 DLIGATYIIT FADFSTYILS LRSYQANSPS DDTWGIWFGS IDDPVQAVIS 

101 FLKDHGFALP STLAQDPLLC TNK* 

The cp6431 nucleotide sequence <SEQ ID 310> is: 

1 TTGCGAGCAG GAGGTAGTCT TGTTACAACA TACCCTAAGG AAGGTCAGAG 

51 ATTGCGCTCC CCAGAACAGT TAAGAGTTCT GGATGATTTA GTGCAAAGCT 

101 ATCCAAATCA CCTACATGCG ATTGAACTTG ATTGTGGTGC AATCCCTCAA 

151 GATTTGATCG GAGCCACCTA TATCATCACG TTCGCCGATT TTTCCACCTA 

201 TATTCTCTCT TTAAGAAGCT ACCAAGCCAA TTCTCCCTCC GATGATACAT 

251 GGGGGATTTG GTTTGGATCT ATTGACGATC CTGTTCAAGC AGTCATATCA 

301 TTTTTAAAAG ATCATGGATT TGCTCTTCCC TCGACCTTAG CTCAAGATCC 

351 TTTGCTTTGT ACTAACAAGT AA 

The PSORT algorithm predicts cytoplasm (0.2115). 

The following C.pneumoniae protein (pid 4376443) was also expressed <SEQ ID 311; cp6443>: 

1 MIMTTISNSP SPALNPELSL IPPPTLVSSG TQTSLAYTIP AQGRRSTDRI 

51 ILDIFIIILG LATIISTFIV IFFLNGLNLL STPSIISSSC LIIVGLLFLI 

101 MGLYFMISSL DQGLVGLLQK ELSQAEEREE EYIQEIEALR GAPRAESPTE 

151 SPSTWL* 

The cp6443 nucleotide sequence <SEQ ID 312> is: 

1 ATGATTATGA CTACTATATC TAACTCACCC TCCCCTGCAT TGAATCCCGA 

51 ACTTTCCCTT ATTCCTCCAC CAACACTTGT ATCTTCAGGT ACGCAAACAT 

101 CTCTAGCTTA TACGATCCCC GCACAAGGAC GAAGATCCAC CCTACGTATT 

151 ATATTAGATA TATTCATTAT CATTCTTGGT TTAGCTACGA TCATTTCTAC 

201 CTTTATTGTT ATTTTCTTTT TAAATGGGCT GAACTTGCTC TCGACCCCAT 
251 CTATTATCTC TTCGTCATGT TTAATCATTG TTGGATTGCT TTTTTTGATT 

301 ATGGGGTTAT ATTTCATGAT CTCGAGTTTG GATCAGGGGC TTGTAGGCCT 
351 TCTGCAAAAG GAACTCTCTC AAGCCGAAGA AAGAGAAGAA GAGTATATCC 

401 AGGAAATCGA AGCTTTAAGA GGAGCTCCTA GAGCAGAATC TCCCACAGAG 

451 TCTCCTAGTA CCTGGTTATG A 

The PSORT algorithm predicts inner membrane (0.5585). 

The following C.pneumoniae protein (pid 4376496) was also expressed <SEQ ID 313; cp6496>: 

1 MLIGRYSSDD QFTEATKNTP TIIKLGFVRD NLEGLTNPIS EIVSETSSSI 
51 KDSVLRSLPI LGSILGCARL YSTLSTNDPL DETQEKIWHT IFGALETLGL 

101 GILILLFKII FVILHCIFHL VIGFCK* 

The cp6496 nucleotide sequence <SEQ ID 314> is: 

1 ATGCTAATAG GCAGATACAG TAGTGATGAC CAATTCACTG AAGCAACAAA 

51 AAACACCCCA ACCATAATTA AGCTAGGTTT TGTTAGAGAT AATCTCGAGG 

101 GATTAACGAA CCCTATCTCT GAAATCGTCT CGGAAACCTC CTCTTCTATT 

151 AAAGATTCCG TTCTTCGCTC TCTTCCTATT TTAGGGTCCA TTTTAGGATG 

201 GGCCCGACTT TACAGCACAC TCTCTACAAA TGATCCTCTT GACGAAACTC 

251 AAGAAAAGAT TTGGCACACT ATATTTGGAG CCTTAGAAAC CTTAGGCTTA 

301 GGGATTCTCA TCCTCTTATT TAAAATTATT TTTGTTATAT TACACTGCAT 

351 ATTTCATCTA GTTATTGGGT TCTGCAAATA A 

The PSORT algorithm predicts inner membrane (0.5989). 

The following C.pneumoniae protein (pid 4376654) was also expressed <SEQ ID 315; cp6654>: 

1 MKTKMNSRKK AGQWAIFNSP TPGVSSTLVL AWTPWGYYDK DVQDILERKD 
51 PMSSSLSEKD SKEFLKNLFV DLLENGFTSV HIHAEEAFTP LDHTGKPHFK 
101 RDNVYLPGKL LGALNEAAVQ ANVSADTQFT LFLTQDECNP FHDKKRG* 

The cp6654 nucleotide sequence <SEQ ID 316> is: 

1 ATGAAAACTA AAATGAACTC TAGAAAAAAA GCAGGTCAAT GGGCAATTTT 

51 CAATTCTCCA ACTCCTGGTG TCAGTTCAAC TTTAGTTTTA GCATGGACTC 

101 CTTGGGGTTA TTACGACAAG GATGTACAAG ATATCTTAGA AAGAAAAGAT 

151 CCGATGAGCT CTTCGCTTTC TGAAAAAGAC TCAAAGGAGT TCTTGAAAAA 

201 TCTGTTTGTA GATCTCTTAG AAAATGGCTT CACATCAGTA CATATTCACG 

251 CAGAAGAAGC TTTCACTCCT CTTGATCATA CCGGGAAACC TCACTTTAAA 

301 AGAGACAATG TGTACTTACC CGGAAAGTTG TTAGGCGCCT TGAATGAGGC 

351 TGCGGTACAA GCCAATGTAA GTGCGGATAC TCAATTTACA TTGTTCCTTA 

401 CTCAAGATGA GTGCAATCCT TTTCATGATA AGAAAAGAGG TTAA 

The PSORT algorithm predicts cytoplasm (0.0730). 

The proteins were expressed in E.coli and purified as his-tag products (Figure 154A; 6412 = lanes 
2-3; 6431 = lanes 11-12; 6443 = lanes 5-6; 6496 = lanes 8-9; 6654 = lane 10; markers in lanes 1, 4, 
7). The recombinant proteins were used to immunise mice, whose sera were used in Western blots 
(Figures 154B, 155, 156, 157 & 158) and for FACS analysis. 

These experiments show that cp6412, cp6431, cp6443, cp6496 & cp6654 are surface-exposed and 
immunoaccessible proteins and that they are useful immunogens. These properties are not evident 
from their sequences alone. 
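
Where several proteins share one gel, the figure legend assigns lanes in the form "6412 = lanes 2-3". The following Python sketch, given only as an illustration with a hypothetical helper name `parse_lanes`, turns such a legend fragment into a mapping from protein number to lane numbers; it assumes four-digit protein numbers and simple "a-b" ranges as used in this Example.

```python
import re


def parse_lanes(legend):
    """Map each protein number to its list of gel lanes."""
    lanes = {}
    pattern = r"(\d{4})\s*=\s*lanes?\s+(\d+(?:-\d+)?)"
    for protein, spec in re.findall(pattern, legend):
        if "-" in spec:
            first, last = (int(x) for x in spec.split("-"))
            lanes[protein] = list(range(first, last + 1))
        else:
            lanes[protein] = [int(spec)]
    return lanes


legend = ("6412 = lanes 2-3; 6431 = lanes 11-12; 6443 = lanes 5-6; "
          "6496 = lanes 8-9; 6654 = lane 10")
print(parse_lanes(legend))
```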

Example 159 and 
Example 160 

The following C.pneumoniae protein (pid 4376477) was expressed <SEQ ID 317; cp6477>: 

1 LLKFFLVCEE LCILTVATHR ALLETPLALS FFKELKTKYV YRAKDILQLH 
51 NYKGFTILNT SPLCS* 

The cp6477 nucleotide sequence <SEQ ID 318> is: 

1 TTGCTAAAGT TCTTTCTAGT ATGTGAAGAG TTATGTATAC TTACTGTTGC 

51 TACACATAGA GCTCTCTTAG AAACTCCTTT AGCTCTATCA TTTTTTAAAG 

101 AACTTAAGAC AAAATATGTC TACAGGGCGA AAGACATACT ACAACTACAT 

151 AACTATAAAG GATTTACTAT CCTTAATACA TCACCGTTAT GTTCTTAA 

The PSORT algorithm predicts inner membrane (0.128). 

The following C.pneumoniae protein (pid 4376435) was also expressed <SEQ ID 319; cp6435>: 

1 LWSHFPRGFF MLPFCPTILL AKPFLNSENY GLERLAATVD SYFDLGQSQI 
51 VFLSKQDQGI TVEELSAKDR KFKPGSMNCT LYTEDPILPA HNSFSNCSDI 
101 QMRTPISPIH * 

The cp6435 nucleotide sequence <SEQ ID 320> is: 

1 TTGTGGTCGC ATTTCCCAAG AGGATTTTTT ATGCTCCCTT TTTGCCCTAC 

51 CATCCTTCTT GCTAAACCTT TTTTAAATAG CGAGAATTAC GGCTTAGAAC 

101 GTTTAGCTGC AACCGTAGAT TCTTATTTTG ATCTGGGACA GTCTCAAATA 

151 GTCTTCCTAA GCAAACAGGA TCAAGGAATC ACTGTGGAAG AATTGAGTGC 

201 TAAAGATAGG AAATTCAAGC CAGGCTCTAT GAACTGTACA CTGTACACTG 

251 AAGATCCTAT CTTACCTGCT CATAATTCCT TTAGTAATTG CTCTGATATT 

301 CAAATGCGTA CTCCGATTAG CCCTATACAT TAA 

The PSORT algorithm predicts periplasmic space (0.4044). 

The proteins were expressed in E.coli and purified as his-tag products (Figure 159A; 6435 = lanes 
2-4; 6477 = lanes 5-7). The recombinant proteins were used to immunise mice, whose sera were used 
in Western blots (Figures 159B & 160) and for FACS analysis. 

These experiments show that cp6477 & cp6435 are surface-exposed and immunoaccessible proteins 
and that they are useful immunogens. These properties are not evident from the sequences alone. 

Example 161 and 
Example 162 and 
Example 163 

The following C.pneumoniae protein (pid 4376441) was expressed <SEQ ID 321; cp6441>: 

1 VEAGANVLVI DTAHAHSKGV FQTVLEIKSQ FPQISLVVGN LVTAEAAVSL 

51 AEIGVDAVKV GIGPGSICTT RIVSGVGYPQ ITAITNVAKA LKNSAVTVIA 

101 DGRIRYSGDV VKALAAGADC VMLGSLLAGT DEAPGDIVSI DEKLFKRYRG 

151 MGSLGAMKQG SADRYFQTQG QKKLVPGGVE GLVAYKGSVH DVLYQILGGI 

201 RSGMGYVGAE TLKDLKTKAS FVRITESGRA ESHIHNIYKV QPTLNY 

The cp6441 nucleotide sequence <SEQ ID 322> is: 



1 GTGGAAGCTG 

51 TAAAGGAGTA 

101 TTTCTTTAGT 

151 GCTGAGATTG 

201 CTGTACAACT 

251 TTACAAACGT 

301 GATGGGAGAA 

351 AGCAGACTGT 

401 CTGGGGATAT 

451 ATGGGATCTT 

501 AACACAGGGA 

551 CTTATAAAGG 

601 CGCTCAGGTA 

651 TAAGGCTTCC 

701 TTCATAATAT 



The PSORT algorithm predicts bacterial inner membrane (0.132). 

The following C.pneumoniae protein (pid 4376748) was also expressed <SEQ ID 323; cp6748>: 

1 LFSEGTALNL FRIFAPLRNR VTTEYSRARQ PDLHRIAIVY IGVLDSESSK 

51 ILERLISYMS CIYSESQMYL RFFMGKNVNQ SAVLSKLHVE NLHIRCGFFS 
101 EDAVPESEPF DLSIYVHTDR SCPLPTRKRS SSWELQTVEL PESIYPQSEF 
151 LLMRPRMLS* 

The cp6748 nucleotide sequence <SEQ ID 324> is: 

1 TTGTTCTCTG AGGGGACAGC TCTAAATTTA TTTCGTATAT TTGCTCCACT 

51 ACGCAACCGT GTGACTACAG AATACAGTCG TGCTAGGCAA CCCGACCTAC 

101 ATAGAATTGC CATCGTCTAT ATAGGAGTTC TCGATTCAGA AAGTTCCAAG 

151 ATCCTAGAGC GGCTAATCTC TTATATGAGT TGTATCTATT CTGAATCGCA 

201 AATGTATTTA AGATTCTTTA TGGGCAAGAA TGTAAATCAA AGTGCTGTAC 

251 TCTCAAAATT ACATGTAGAA AATCTGCACA TCCGTTGTGG GTTTTTCAGC 

301 GAGGATGCTG TTCCAGAGAG TGAGCCCTTC GATCTCTCCA TCTACGTGCA 

351 CACAGATCGT AGCTGTCCTC TCCCTACGAA AAAACGGAGC AGCTCCTGGG 

401 AACTCCAAAC TGTAGAACTC CCAGAGTCAA TATATCCACA GTCGGAATTC 

451 CTATTGATGA GACCTCGAAT GCTTTCGTAG 

The PSORT algorithm predicts cytoplasm (0.170). 

The following C.pneumoniae protein (pid 4376881) was also expressed <SEQ ID 325; cp6881>: 

1 MRPHRKHVSS KSLALKQSAS THVEITTKAF RLSMPLKQLI LEKSDHLPPM 

51 ETIRVVLTSH KDKLGTEVHV VASHGKEILQ TKVHNANPYT AVINAFKKIR 

101 TMANKHSNKR KDRTKHDLGL AAKEERIAIQ EEQEDRLSNE WLPVEGLDAW 

151 DSLKTLGYVP ASAKKKISKK KMSIRMLSQD EAIRQLESAA ENFLIFLNEQ 

201 EHKIQCIYKK HDGNYYLIEP SLKPGFCI* 

The cp6881 nucleotide sequence <SEQ ID 326> is: 

1 ATGAGACCTC ATCGTAAACA CGTATCATCT AAAAGCTTAG CTTTAAAGCA 

51 ATCTGCATCA ACTCATGTAG AGATCACAAC AAAAGCCTTT CGTCTCTCTA 

101 TGCCTCTAAA ACAGCTGATC CTAGAGAAAA GCGACCACCT CCCCCCTATG 

151 GAAACAATCC GTGTGGTGCT AACCTCTCAT AAAGATAAGC TAGGCACCGA 

201 GGTGCATGTT GTAGCTTCTC ATGGCAAAGA AATCCTTCAA ACTAAGGTTC 

251 ATAACGCAAA CCCATACACT GCAGTGATCA ATGCTTTTAA GAAAATCCGC 

301 ACCATGGCAA ATAAGCACTC CAATAAACGT AAAGACAGGA CAAAACATGA 

351 TCTAGGTCTT GCAGCAAAAG AAGAACGTAT CGCAATACAG GAAGAACAAG 

401 AAGATCGCCT TAGCAACGAG TGGCTTCCTG TCGAAGGCCT CGATGCOTGG 

451 GATTCTCTAA AAACTCTTGG GTATGTTCCC GCATCAGCGA AAAAGAAGAT 

501 CTCCAAGAAA AAGATGAGCA TTCGTATGCT ATCTCAAGAC GAGGCTATCC 

551 GCCAGCTAGA GTCTGCCGCA GAAAACTTCC TGATCTTCTT GAACGAGCAA 

601 GAGCATAAAA TCCAATGCAT TTATAAAAAA CATGACGGCA ACTATGTCCT 

651 TATTGAACCT TCCCTCAAGC CAGGATTCTG CATCTGA 

The PSORT algorithm predicts cytoplasm (0.249). 

The proteins were expressed in E.coli and purified as his-tag products (Figure 161A; 6441 = lanes 
7-9; 6748 = lanes 2-3; 6881 = lanes 4-6). The recombinant proteins were used to immunise mice, 
whose sera were used in Western blots (Figures 161B, 162 & 163) and for FACS analysis. 

These experiments show that cp6441, cp6748 & cp6881 are surface-exposed and immunoaccessible 
proteins and that they are useful immunogens. These properties are not evident from the sequence 
alone. 
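
A protein listing and the translation of its nucleotide listing can be compared residue by residue to locate disagreements. The following Python sketch illustrates such a comparison; the function name `mismatches` and its `offset` parameter are hypothetical. In the example, the final line of the cp6881 protein listing as printed here is compared with the literal translation of bases 601-660 of SEQ ID 326; the single difference cannot be resolved from the listings alone.

```python
def mismatches(listed, translated, offset=1):
    """Return (position, listed_residue, translated_residue) for each difference.

    Spaces between ten-residue blocks are ignored; 'offset' is the sequence
    position of the first residue supplied.
    """
    listed = "".join(listed.split())
    translated = "".join(translated.split())
    return [
        (offset + i, a, b)
        for i, (a, b) in enumerate(zip(listed, translated))
        if a != b
    ]


# Residues 201-220 of cp6881: as listed, and as translated from SEQ ID 326.
listed_fragment = "EHKIQCIYKK HDGNYYLIEP"
translated_fragment = "EHKIQCIYKK HDGNYVLIEP"
print(mismatches(listed_fragment, translated_fragment, offset=201))
```

The comparison stops at the end of the shorter string, so a length difference between the two inputs should be checked separately.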



Example 164, 
Example 165 and 
Example 166 

The following C.pneumoniae protein (pid 4376444) was expressed <SEQ ID 327; cp6444>: 

1 MEQPNCVIQD TTTVLYALNS FDPRLSDDTH RLGKQSPLEA ENALGEFIEG 
51 LDTNSFPLEE VAIPILPGYH PKFYLSFIDR DDQGVHYEVL DGVFLKTVAA 
101 CIIENSFLTD SMSPELLSEV KEALKR* 

The cp6444 nucleotide sequence <SEQ ID 328> is: 

1 ATGGAGCAAC CCAATTGTGT GATTCAGGAT ACTACAACTG TTTTGTATGC 

51 CTTAAATAGC TTTGATCCTA GACTTAGTGA TGACACTCAC AGACTTGGGA 

101 AGCAATCACC TCTTGAAGCA GAAAATGCTC TTGGAGAATT TATTGAAGGT 

151 TTGGATACAA ATAGCTTTCC TTTAGAGGAA GTTGCCATTC CCATCCTGCC 

201 AGGTTATCAC CCTAAGTTTT ATTTATCTTT CATAGATAGG GACGATCAAG 

251 GTGTCCACTA TGAAGTTTTA GATGGCGTAT TTTTAAAGAC AGTCGCTGCT 

301 TGTATTATAG AGAACTCCTT CTTAACTGAT TCTATGAGCC CGGAGCTTCT 

351 CAGCGAAGTT AAGGAAGCTC TGAAACGATG A 

The PSORT algorithm predicts cytoplasm (0.2031). 

The following C.pneumoniae protein (pid 4376413) was also expressed <SEQ ID 329; cp6413>: 

1 MAVQSIKEAV TSAATSVGCV NCSREAIPAF NTEERATSIA RSVIAAIIAV 
51 VAISLLGLGL VVLAGCCPLG MAAGAITMLL GVALLAWAIL ITLRLLNIPK 
101 AEIPSPGHNG EPNERNSATP PLEGGVAGEA GRGGGSPLTQ LDLNSGAGS* 

The cp6413 nucleotide sequence <SEQ ID 330> is: 

1 ATGGCTGTTC AATCTATAAA AGAAGCCGTA ACATCAGCCG CAACATCAGT 

51 AGGATGTGTA AACTGTTCTA GAGAGGCTAT ACCAGCATTT AATACAGAGG 

101 AGAGAGCAAC GAGTATTGCT AGATCTGTTA TAGCAGCTAT CATTGCTGTT 

151 GTAGCTATCT CCTTACTCGG ACTAGGTCTT GTAGTTCTTG CTGGTTGCTG 

201 TCCTTTAGGA ATGGCTGCGG GTGCTATAAC AATGCTGCTG GGTGTAGCAT 

251 TATTAGCTTG GGCAATACTG ATTACTTTGA GACTGCTTAA TATACCTAAG 

301 GCTGAAATAC CGAGTCCAGG GAACAACGGT GAGCCTAATG AAAGAAATTC 

351 AGCAACTCCT CCTCTAGAGG GTGGTGTTGC AGGAGAAGCC GGTCGCGGCG 

401 GGGGGTCACC TTTAACCCAA CTTGATCTCA ATTCAGGGGC GGGAAGTTAG 

The PSORT algorithm predicts inner membrane (0.6180). 

The following C.pneumoniae protein (pid 4377391) was also expressed <SEQ ID 331; cp7391>: 

1 MMLRVIELPL LPIKQALEKA FVQYNSYKAK LTKVEPCFRE SPAYITSEER 

51 LQSLDQTLER AYKEYQKRFQ EPSRLESEVS GCREHLREQV KQFETQGLDL 

101 IKEELIFVSD VLFRKMVSCL VSTVHVPFME FYYEYFELHR LRLRAQWMAN 

151 AEIYSKVRKA FPEMLKETLE KAKAPREEEY WLLCEERKSK EKRLILNKIE 

201 AAQQRVKDLE PPPIKETGKQ KRKKEYSFFI RLKS* 

The cp7391 nucleotide sequence <SEQ ID 332> is: 

1 ATGATGCTTC GTGTCATAGA GCTTCCACTA CTTCCTATAA AGCAAGCGTT 
51 GGAGAAGGCT TTTGTACAAT ATAATAGCTA CAAAGCGAAG TTAACCAAGG 
101 TAGAACCTTG CTTTAGAGAG AGCCCTGCCT ATATAACTAG CGAAGAGCGA 

151 CTCCAGAGTT TGGATCAGAC TTTAGAACGT GCGTACAAAG AGTACCAGAA 

201 GAGATTCCAG GAGCCTTCAC GTTTGGAATC GGAAGTAAGT GGATGTAGAG 
251 AGCATCTTAG AGAGCAGGTA AAACAATTTG AAACTCAAGG ACTAGACTTG 
301 ATCAAAGAAG AGCTTATTTT TGTTAGTGAT GTGTTATTCC GAAAAATGGT 
351 CAGTTGTCTA GTGTCGACAG TGCATGTTCC CTTTATGGAG TTTTATTATG 

401 AGTATTTTGA GTTGCATAGA TTGAGGTTGC GGGCCCAATG GATGGCGAAT 

451 GCCGAGATTT ATAGCAAAGT TAGAAAAGCA TTCCCAGAGA TGTTGAAGGA 
501 GACCTTAGAA AAAGCTAAGG CTCCCAGAGA AGAAGAGTAT TGGTTACTTT 
551 GCGAGGAGAG AAAGAGTAAG GAGAAGCGTT TGATTCTCAA CAAGATAGAG 
601 GCAGCTCAGC AGCGGGTAAA AGATTTAGAA CCTCCTCCTA TTAAAGAGAC 

651 AGGGAAACAG AAACGGAAGA AAGAATATTC GTTTTTCATT CGATTAAAAT 

701 CGTGA 

The PSORT algorithm predicts inner membrane (0.1489). 

The proteins were expressed in E.coli and purified as his-tag and GST-fusion products (Figure 164A; 
6444 = lanes 11-12; 7391 = lanes 2-3; 6413 = lanes 4-6). The recombinant proteins were used to immunise 
mice, whose sera were used in Western blots (Figures 164B, 165 & 166) and for FACS analysis. 

These experiments show that cp6444, cp6413 & cp7391 are surface-exposed and immunoaccessible 
proteins and that they are useful immunogens. These properties are not evident from the sequence 
alone. 



Example 167, 
Example 168, 

Example 169 and 
Example 170 

The following C.pneumoniae protein (pid 4376463) was expressed <SEQ ID 333; cp6463>: 

1 MKKKVTIDEA LKEIDRLEGA ATQEELCAKL LAQGFATTQS SVSRWLRKIQ 
51 AVKVAGERGA RYSLPSSTEK TTTRHLVLSI RHNASLIVIR TVPGSASWIA 

101 ALLDQGLKDE ILGTLAGDDT IFVTPIDEGR LPLLMVSIAN LLQVFLD* 

The cp6463 nucleotide sequence <SEQ ID 334> is: 

1 ATGAAAAAAA AAGTAACTAT AGATGAGGCT TTAAAAGAAA TTTTACGTCT 

51 TGAAGGAGCG GCAACTCAGG AGGAATTATG TGCAAAACTC TTAGCTCAAG 

101 GTTTTGCTAC AACCCAGTCG TCTGTATCTC GTTGGCTACG AAAGATTCAG 

151 GCTGTAAAGG TTGCTGGAGA GCGTGGTGCT CGTTATTCTT TACCCTCTTC 

201 AACAGAGAAG ACCACGACCC GTCATTTGGT GCTCTCTATT CGCCATAACG 

251 CCTCTCTTAT TGTAATTCGT ACGGTTCCTG GTTCAGCTTC TTGGATCGCT 

301 GCTTTGTTAG ATCAAGGGCT CAAAGATGAA ATTCTTGGAA CTTTGGCAGG 

351 AGATGACACG ATTTTTGTCA CTCCTATAGA TGAAGGGAGG CTCCCATTGT 

401 TGATGGTTTC GATTGCAAAT TTACTGCAAG TTTTCTTGGA TTAA 

The PSORT algorithm predicts inner membrane (0.1510). 

The following C.pneumoniae protein (pid 4376540) was also expressed <SEQ ID 335; cp6540>: 

1 MSQCQSSSTS TWEWMKSFVP NWKNPTPPLS PIPSEDEFIL AYEPFVLPKT 
51 DPENAQANPP GTSTPNVENG IDDLNPLLGQ PNEQNNANNP GTSGSNPTSL 
101 PAPERLPETE ENSQEEEQGS QNNEDLIG* 

The cp6540 nucleotide sequence <SEQ ID 336> is: 

1 ATGTCTCAAT GTCAGAGTAG CAGTACATCT ACCTGGGAAT GGATGAAATC 

51 TTTTGTGCCA AACTGGAAGA ATCCAACTCC CCCCTTATCT CCTATACCTT 

101 CTGAGGACGA ATTTATATTA GCATACGAGC CATTTGTTCT ACCGAAAACA 

151 GATCCAGAAA ACGCACAAGC TAATCCTCCA GGCACATCTA CACCGAATGT 

201 AGAAAACGGG ATCGATGATC TCAACCCTCT TCTGGGGCAA CCCAACGAAC 

251 AAAACAATGC CAACAATCCA GGAACTTCTG GATCTAATCC TACATCTCTA 

301 CCCGCCCCCG AACGACTCCC TGAAACTGAA GAGAACAGCC AAGAAGAAGA 

351 ACAAGGATCT CAAAATAATG AGGATCTTAT AGGATAA 

The PSORT algorithm predicts cytoplasm (0.3086). 

The following C.pneumoniae protein (pid 4376743) was also expressed <SEQ ID 337; cp6743>: 

1 LREEGSVSFR EYFRAYMCDK IVAQKNFLFT LDAVIKQAGW RSQEKLNLFY 
51 VESQALGREI KVSLEEYIQS MVGILGSQRT KKSFKFSVDF TPLEQALQER 
101 CSSDDDEDAT ATSTATGATA SPTHMHEDE* 

The cp6743 nucleotide sequence <SEQ ID 338> is: 

1 TTGAGAGAAG AAGGTAGTGT TTCTTTCAGA GAATATTTCA GAGCCTATAT 

51 GTGTGATAAA ATCGTGGCAC AGAAGAACTT CTTATTTACT TTAGACGCTG 

101 TAATTAAACA GGCCGGTTGG AGATCACAAG AGAAACTCAA TTTATTTTAT 

151 GTTGAAAGTC AGGCTTTAGG AAGAGAAATC AAAGTCAGCT TAGAGGAATA 

201 TATTCAGAGT ATGGTCGGGA TTTTGGGATC TCAGAGAACC AAGAAAAGCT 

251 TTAAGTTTTC TGTCGACTTT ACCCCTTTAG AGCAGGCTCT ACAAGAAAGA 

301 TGCTCTTCTG ATGATGACGA AGATGCAACA GCAACTTCGA CCGCTACAGG 

351 GGCAACAGCA TCTCCGACTG ACATGCACGA AGATGAGTAA 

The PSORT algorithm predicts cytoplasm (0.2769). 

The following C.pneumoniae protein (pid 4377041) was also expressed <SEQ ID 339; cp7041>: 

1 MLMMLMMIIG ITGGSGAGKT TLTQNIKEIF GEDVSVTCQD NYYKDRSHYT 

51 PEERANLIWD HPDAFDNDLL ISDIKRLKNN EIVQAPVFDF VLGNRSKTEI 

101 ETIYPSKVIL VEGILVFENQ ELRDLMDIRI FVDTDADERI LRRMVRDVQE 

151 QGDSVDCIMS RYLSMVKPMH EKFIEPTRKY ADIIVHGNYR QNVVTNILSQ 

201 KIKNHLENAL ESDETYYMVN SK* 

The cp7041 nucleotide sequence <SEQ ID 340> is: 

1 ATGTTGATGA TGCTTATGAT GATTATTGGA ATTACAGGAG GTTCTGGAGC 

51 TGGGAAAACC ACCCTAACCC AAAACATTAA AGAAATTTTC GGTGAGGATG 

101 TGAGTGTTAT CTGCCAAGAT AATTATTACA AAGATAGATC TCATTATACT 

151 CCTGAAGAAC GTGCCAATTT AATTTGGGAT CATCCGGACG CCTTTGATAA 

201 TGACTTATTA ATTTCAGACA TAAAACGTCT AAAAAATAAT GAGATTGTCC 
251 AAGCCCCAGT TTTTGATTTT GTTTTAGGTA ATCGATCTAA AACGGAGATA 

301 GAAACGATCT ATCCATCTAA AGTTATTCTT GTTGAAGGTA TTCTGGTCTT 
351 TGAAAATCAA GAACTTAGAG ATCTTATGGA TATTAGGATC TTTGTAGACA 

401 CCGATGCTGA TGAAAGGATA CTACGCCGTA TGGTTCGAGA TGTTCAAGAA 

451 CAAGGAGATA GCGTGGACTG CATCATGTCT CGTTATCTTT CTATGGTAAA 

501 GCCTATGCAT GAGAAATTTA TAGAGCCGAC TCGGAAATAT GCTGATATCA 

551 TTGTACATGG AAATTACCGA CAAAACGTAG TAACAAATAT TTTGTCACAG 

601 AAAATTAAAA ATCATTTAGA GAATGCCCTG GAAAGCGATG AGACGTATTA 

651 TATGGTCAAC TCTAAGTAA 

The PSORT algorithm predicts inner membrane (0.1022). 

The proteins were expressed in E.coli and purified as his-tag products (Figure 167A; 6463 = lanes 
2-4; 6540 = lanes 5-7; 6743 = lanes 8-9; 7041 = lanes 10-11). The recombinant proteins were used to 
immunise mice, whose sera were used in Western blots (Figures 167B, 168, 169 & 170) and for 
FACS analysis. 

These experiments show that cp6463, cp6540, cp6743 & cp7041 are surface-exposed and 
immunoaccessible proteins and that they are useful immunogens. These properties are not evident 
from the sequence alone. 

Example 171 and 
Example 172 and 
Example 173 

The following C.pneumoniae protein (pid 4376632) was expressed <SEQ ID 341; cp6632>: 

1 VQLFQYMNES GWDWLCDFDS QGEGFQLSRL VGLLHSSWAL YEAKEQFYLP 
51 EVSLLTWEEL IEMQLLSKPT KHGVAKDLCN VFEKHFQRFR QYLGSLDLNQ 
101 RFENTFLNYP KYHLDRE* 

The cp6632 nucleotide sequence <SEQ ID 342> is: 

1 GTGCAATTAT TTCAATATAT GAATGAGTCC GGATGGGATT GGCTTTGTGA 
51 TTTTGATTCT CAAGGCGAGG GATTCCAGTT ATCACGTCTG GTTGGGCTGT 
101 TACATTCGTC CTGGGCATTA TACGAAGCAA AAGAGCAATT TTACCTTCCT 
151 GAGGTTTCTC TATTGACCTG GGAAGAACTG ATAGAAATGC AGTTATTAAG 

201 CAAACCAACA AAACACGGGG TTGCAAAAGA TCTTTGTAAT GTATTTGAAA 
251 AACACTTTCA AAGGTTTAGA CAGTACCTAG GTTCCTTAGA TCTAAATCAA 
301 AGGTTCGAAA ATACCTTCTT GAATTATCCT AAATACCATT TAGATAGGGA 
351 GTGA 

The PSORT algorithm predicts cytoplasm (0.3627). 

The following C.pneumoniae protein (pid 4376648) was also expressed <SEQ ID 343; cp6648>: 

1 MPVSSAPLPT SHRPSSGNLG LMEPNSKALK AKHQDKTTKT IKLLVKILVA 
51 ILVIEVLGII AAFFIPGTPP ICLIILGGLI LTTVLCVLLL VIKIALVNKT 
101 EGTTAEQQIK RKLSSKSIS* 

The cp6648 nucleotide sequence <SEQ ID 344> is: 

1 ATGCCCGTGT CCTCAGCCCC CCTACCCACA AGCCACCGCC CTTCCTCTGG 

51 AAATCTAGGC CTCATGGAAC CAAATTCCAA AGCTCTAAAA GCAAAGCATC 

101 AAGATAAAAC GACGAAGACG ATTAAACTTT TAGTTAAAAT CCTTGTTGCC 

151 ATTCTAGTAA TAGAAGTTTT AGGAATAATT GCAGCTTTCT TTATTCCTGG 

201 GACTCCTCCC ATCTGCTTGA TTATCCTAGG AGGCCTTATT CTTACAACAG 

251 TACTCTGTGT GCTTCTTCTT GTTATAAAGC TTGCCCTTGT AAACAAAACC 

301 GAAGGAACAA CTGCTGAACA GCAGATAAAA CGTAAACTCT CTTCTAAAAG 

351 TATTTCTTAG 

The PSORT algorithm predicts inner membrane (0.6074). 

The following C.pneumoniae protein (pid 4376497) was also expressed <SEQ ID 345; cp6497>: 

1 MKPNSIIFLE NTKHYPDIFR EGFVRDRHGL MEASDWLLST EITIIRSILG 
51 AIPILGNILG AGRLYSVWYT SDEDWKKQVV* 

The cp6497 nucleotide sequence <SEQ ID 346> is: 

1 ATGAAGCCAA ATAGTATTAT TTTTTTAGAA AATACTAAGC ATTATCCCGA 

51 CATCTTTCGA GAAGGATTTG TTCGTGATCG TCATGGACTA ATGGAAGCCT 

101 CGGATTGGTT ACTTTCTACG GAAATTACGA TCATTCGOTC CATTCTGGGA 

151 GCTATCCCTA TTTTAGGAAA TATTCTTGGA GCCGGACGAC TCTATAGCGT 

201 TTGGTATACA AGTGACGAAG ATTGGAAAAA ACAAGTGGTT TGA 

The PSORT algorithm predicts inner membrane (0.145). 

The proteins were expressed in E.coli and purified as his-tag products (Figure 171A; 6632 = lanes 
5-7; 6648 = lanes 8-10; 6497 = lanes 2-4). The recombinant proteins were used to immunise mice, 
whose sera were used in Western blots (Figures 171B, 172, 173) and for FACS analysis. 

These experiments show that cp6632, cp6648 and cp6497 are surface-exposed and 
immunoaccessible proteins and that they are useful immunogens. These properties are not evident 
from the sequence alone. 

Example 174, 

Example 175, 

Example 176, 

Example 177 and 
Example 178 

The following C.pneumoniae protein (pid 4377200) was expressed <SEQ ID 347; cp7200>: 

1 MPVPIDNSSR NLQEVPESLE DLEQHAEESP THQSAESSSL QLSLASSAIS 

51 SRVEQLSSLV LGMENSDFSS LRDVPIFSAI YESSTHTPVP TPLVGVGYIN 

101 GSQSGYYDTQ RESLHLSQLL GSRRVEVVYN QGNFMEASLL NLCPRRPRRD 

151 PSPISLALLE LWEAFHLEHP PGSTFNPIFF W* 

The cp7200 nucleotide sequence <SEQ ID 348> is: 



1 ATGCCCGTTC CTATAGATAA TTCCTCTCGC AACCTACAAG AAGTTCCAGA 

51 AAGCCTAGAA GACCTCGAAC AACACGCAGA AGAATCTCCT ACTCATCAAA 

101 GTGCAGAAAG CAGTTCTTTG CAACTGTCTC TAGCCTCCTC AGCAATTTCT 

151 AGTAGAGTAG AACAACTATC TTCCCTCGTC TTAGGAATGG AAAATTCAGA 

201 TTTCTCCTGT TYAAGAGACG TTCCTATCTT CTCAGCTATC TACGAATCTT 

251 CAACACACAC ACCTGTCCCC ACTCCTCTAG TTGGCGTGGG ATATATCAAC 

301 GGAAGTCAAT CAGGATACTA CGATACACAA AGAGAATCTC TTCACCTCAG 

351 CCAATTGTTA GGAAGCCGAA GAGTTGAAGT TGTCTATAAC CAAGGAAACT 

401 TCATGGAGGC CTCTTTGCTA AATCTGTGCC CCAGAAGACC TCGAAGAGAT 

451 CCCTCTCCAA TTTCTTTAGC TCTATTAGAG CTCTGGGAAG CATTTTTTTT 

501 AGAACACCCC CCAGGTAGCA CTTTTAATCC AATATTTTTT TGGTAA 

The PSORT algorithm predicts cytoplasm (0.3672). 

The following C.pneumoniae protein (pid 4377235) was also expressed <SEQ ID 349; cp7235>: 

1 LNFVSTLTGS DFYAPVLEKL EEAFADTTGQ VILFSSSPDF IVHPIAQQLG 
51 ISSWYASCYR DQSAEQTIYK KCLTGDKKAQ ILSYIKKINQ ARSHTFSDHI 

101 LDLPFLMLGE EKTVVRPQGR LKKMAKKYYW NIV* 

The cp7235 nucleotide sequence <SEQ ID 350> is: 

1 TTGAATTTTG TATCGACTCT GACCGGCTCC GATTTTTATG CTCCTGTTTT 
51 AGAAAAACTA GAAGAAGCTT TTGCAGATAC CACAGGACAG GTGATCCTTT 

101 TTTCTTCTTC TCCAGACTTT ATTGTCCACC CCATAGCGCA GCAACTCGGG 

151 ATTAGTTCTT GGTATGCGTC GTGTTATCGC GATCAGTCTG CAGAACAGAC 

201 GATCTATAAA AAATGTCTTA CAGGGGATAA AAAAGCGCAA ATTTTGAGTT 

251 ATATTAAAAA AATTAATCAA GCAAGAAGCC ATACCTTCTC CGACCATATT 

301 TTAGATCTTC CTTTTCTTAT GCTGGGAGAA GAGAAAACCG TCGTTCGCCC 

351 TCAGGGACGA CTCAAGAAAA TGGCAAAAAA ATATTACTGG AATATCGTTT 

The PSORT algorithm predicts cytoplasm (0.3214). 

The following C.pneumoniae protein (pid 4377268) was also expressed <SEQ ID 351; cp7268>: 

1 MMHRYFIPLL ALLIFSPSLV RAELQPSENR KGGWPTQLSC AEGSQLFCKF 
51 EAAYNNAIEE GKPGILVFFS ERPTPEFADL TNGSFSLSTP IAKGFNVVVL 
101 CPGLISPLDF FHKMDPVILY MGSFLEMFPE VEAVSGPRLC YILIDEQGGA 
151 QCQAVLPLET KN* 

The cp7268 nucleotide sequence <SEQ ID 352> is: 

1 ATGATGCACC GTTATTTTAT TCCTTTATTA GCACTTCTCA TTTTCTCTCC 

51 TTCTTTAGTC AGGGCAGAGC TACAACCAAG TGAAAACAGA AAAGGGGGGT 

101 GGCCTACACA ACTTTCCTGT GCAGAAGGTT CGCAACTCTT CTGTAAATTC 

151 GAAGCTGCCT ATAATAATGC AATTGAGGAA GGGAAACCTG GGATTTTAGT 

201 CTTTTTCTCT GAGCGACCCA CACCAGAATT TGCCGACTTA ACGAATGGTT 

251 CATTTTCTCT CTCTACGCCA ATCGCCAAGG GCTTTAATGT CGTTGTGTTA 

301 TGCCCCGGGC TTATCAGTCC CTTAGACTTT TTCCACAAAA TGGATCCTGT 

351 GATTCTCTAT ATGGGAAGTT TTCTAGAGAT GTTCCCTGAA GTGGAGGCAG 

401 TTAGTGGCCC TCGCTTATGT TATATCTTAA TAGATGAACA GGGTGGGGCT 

451 CAATGTCAGG CTGTCCTGCC TTTAGAAACA AAGAATTAG 

The PSORT algorithm predicts inner membrane (0.1235). 

The following C.pneumoniae protein (pid 4377375) was also expressed <SEQ ID 353; cp7375>: 

1 MQRIIIVGID TGVGKTIVSA ILARALNAEY WKPIQAGNLE NSDSHIVHEL 
51 SGAYCHPEAY RLHKPLSPHK AAQIDNVSIE ESHICAPKTT SNLIIETSGG 
101 FLSPCTSKRL QGDVFSSWSC SWILVSQAYL GSINHTCLTV EAMRSRNLNI 
151 LGMVVNGYPE DEEHWLTQEI KLPIIGTLAK EKEITKTIIS CYAEQWKEVW 
201 TSNHQGIQGV SGTPSLNLH* 



The cp7375 nucleotide sequence <SEQ ID 354> is: 



1 ATGCAACGTA TCATCATTGT AGGAATCGAC ACTGGCGTAG GAAAAACCAT
51 TGTCAGTGCT ATCCTTGCTA GAGCACTTAA CGCAGAATAC TGGAAACCTA
101 TACAAGCAGG GAATCTAGAA AATTCAGATA GCAATATTGT TCATGAGCTA
151 TCGGGAGCCT ACTGTCATCC CGAAGCTTAT CGATTGCATA AGCCCTTGTC
201 TCCACACAAG GCAGCGCAAA TCGATAATGT AAGTATCGAA GAGAGTCATA
251 TTTGTGCGCC AAAAACAACT TCGAATCTGA TTATTGAGAC TTCAGGAGGA
301 TTTTTATCCC CCTGCACATC AAAAAGACTT CAGGGAGATG TGTTTTCTTC
351 TTGGTCATGT TCTTGGATTT TAGTGAGCCA AGCATATCTC GGAAGTATCA
401 ATCACACCTG TTTAACGGTA GAAGCAATGC GCTCACGAAA CCTCAATATC
451 TTAGGTATGG TGGTAAATGG GTATCCAGAG GACGAAGAGC ACTGGCTAAC
501 TCAAGAAATC AAGCTTCCTA TAATCGGGAC TCTTGCCAAG GAAAAAGAAA
551 TCACAAAGAC AATCATAAGC TGTTATGCCG AACAATGGAA GGAAGTATGG
601 ACAAGCAATC ATCAGGGAAT TCAGGGTGTA TCTGGCACCC CTTCACTCAA
651 TCTGCATTAG



The PSORT algorithm predicts cytoplasm (0.0049). 

The following C.pneumoniae protein (pid 4377388) was also expressed <SEQ ID 355; cp7388>:

1 MQVLLSPQLP PPPQHSVGSI SSPSKLRVLA ITFLVFGMLL LISGALFLTL
51 GIPGLSAAIS FGLGIGLSAL GGVLMISGLL CLLVKREIPT VRPEEIPEGV
101 SLAPSEEPAL QAAQKTLAQL PKELDQLDTD IQEVFACLRK LKDSKYESRS
151 FLNDAKKELR VFDFVVEDTL SEIFELRQIV AQEGWDLNFL INGGRSLMMT
201 AESESLDLFH VSKRLGYLPS GDVRGEGLKK SAKEIVARLM SLHCEIHKVA
251 VAFDRNSYAM AEKAFAKALG ALEESVYRSL TQSYRDKFLE SERAKIPWNG
301 HITWLRDDAK SGCAEKKLRD AEERWKKFRK AVFWVEEDGG FDINNLLGDW
351 GTVLDPYRQE RMDEITFHEL YEKTTFLKRL HRKCALAKTT FEKKRSKKNL
401 QAVEEANARR LKYVRDWYDQ EFQKAGERLE KLHALYPEVS VSIRENKIQE
451 TRSNLEKAYE AIEENYRCCV REQEDYWKEE EKREAEFRER GNKILSPEEL
501 ESSLEQFDHG LKNFSEKLME LEGHILKLQK EATAEVENKI LSDAESRLEI
551 VFEDVKEMPC RIEEIEKTLR MAELPLLPTK KAFEKACSQY NSCAEMLEKV
601 KPYCKESLAY VTSKERLVSL DEDLRRAYTE CQKRFQGDSG LESEVRACRE
651 QLRERIQEFE TQGLDLVEKE LLCVSSRLRN TECDCVSGVK KEAPPGKKFY
701 AQYYDEIYRV RVQSRWMTMS ERLREGVQAC NKMLKAGLSE EDKVLKEEEY
751 WLYREERKNK EKRLVGTKIV ATQQRVAAFE SIEVPEIPEA PEEKPSLLDK
801 ARSLFTREDH T*



The cp7388 nucleotide sequence <SEQ ID 356> is: 



1 ATGCAAGTAC TTCTATCTCC GCAGCTACCC CCCCCCCCCC AACACTCTGT
51 AGGGTCGATT TCTTCTCCAT CTAAACTTCG CGTTTTAGCG ATTACTTTTT
101 TAGTTTTTGG TATGCTCTTA CTGATTTCAG GAGCTCTCTT TCTGACGTTA
151 GGGATTCCAG GATTGAGTGC AGCAATTTCT TTTGGATTAG GCATCGGTCT
201 CTCCGCATTA GGAGGAGTGC TGATGATTTC GGGACTACTA TGTCTTTTAG
251 TAAAACGAGA GATTCCGACA GTACGACCAG AAGAAATTCC TGAAGGGGTT
301 TCGCTGGCTC CTTCTGAGGA GCCAGCTCTA CAGGCAGCTC AGAAGACTTT
351 AGCTCAGCTG CCTAAGGAAT TGGATCAGTT AGATACAGAT ATTCAGGAAG
401 TGTTCGCATG TTTAAGAAAG CTGAAAGATT CTAAGTATGA AAGTCGAAGT
451 TTTTTAAACG ATGCTAAGAA GGAGCTTCGA GTTTTTGACT TTGTGGTTGA
501 GGATACCCTC TCGGAGATTT TCGAGTTGCG GCAGATTGTG GCTCAAGAGG
551 GATGGGATTT AAACTTTTTG ATCAATGGGG GACGAAGCCT CATGATGACT
601 GCAGAATCTG AATCGCTTGA TTTGTTTCAT GTATCGAAGC GGCTAGGGTA
651 TTTACCTTCT GGGGATGTTC GAGGGGAGGG GTTAAAGAAA TCTGCGAAGG
701 AGATAGTCGC TCGTTTGATG AGCTTGCATT GCGAGATTCA CAAGGTGGCG
751 GTAGCGTTTG ATAGGAATTC CTATGCGATG GCAGAAAAGG CGTTTGCGAA
801 AGCGTTGGGA GCTTTAGAAG AGAGTGTGTA TCGGAGTCTG ACGCAGAGTT
851 ATAGAGATAA ATTTTTGGAG AGCGAGAGGG CGAAGATCCC ATGGAATGGG
901 CATATAACCT GGTTAAGAGA TGATGCGAAG AGTGGGTGTG CTGAAAAGAA
951 GCTTCGGGAT GCCGAGGAAC GTTGGAAGAA ATTTAGGAAA GCAGTCTTTT
1001 GGGTAGAAGA AGACGGGGGC TTTGACATCA ATAATCTCCT TGGAGACTGG
1051 GGGACAGTGC TTGATCCTTA TAGACAAGAG AGAATGGACG AGATAACGTT
1101 CCATGAGTTG TATGAAAAAA CTACGTTTTT GAAAAGACTG CACAGAAAGT
1151 GTGCGTTAGC GAAAACAACC TTTGAAAAGA AGAGATCTAA AAAGAATTTG
1201 CAGGCAGTCG AGGAGGCGAA TGCACGTAGG TTGAAATATG TAAGGGATTG
1251 GTATGATCAG GAGTTTCAGA AAGCAGGGGA GAGATTAGAG AAACTGCATG
1301 CTTTGTATCC TGAGGTTTCA GTCTCTATAA GAGAGAACAA AATACAAGAG
1351 ACGCGCTCTA ATTTAGAGAA AGCCTATGAG GCTATCGAAG AGAACTATCG
1401 TTGCTGTGTC CGAGAGCAAG AGGACTACTG GAAAGAAGAA GAGAAAAGGG
1451 AAGCGGAGTT TAGGGAGAGG GGAAACAAGA TTCTTTCTCC TGAGGAGCTG
1501 GAAAGTTCTT TGGAGCAATT CGACCATGGT TTGAAAAATT TTTCTGAGAA
1551 ATTAATGGAA TTGGAAGGGC ATATCTTAAA ACTTCAGAAA GAAGCCACAG
1601 CAGAGGTGGA GAATAAAATA CTTTCAGATG CAGAGAGCCG CCTTGAGATT
1651 GTATTTGAAG ATGTCAAGGA GATGCCCTGT CGAATTGAGG AGATAGAGAA
1701 GACGCTGCGT ATGGCGGAGC TGCCCCTACT TCCTACGAAG AAGGCGTTTG
1751 AGAAGGCCTG CTCACAATAT AATAGCTGCG CAGAGATGTT GGAGAAGGTG
1801 AAGCCTTACT GCAAGGAGAG CCTCGCCTAT GTGACTAGCA AAGAGCGTTT
1851 AGTGAGCTTG GATGAAGATT TACGACGAGC CTACACAGAG TGTCAGAAGA
1901 GATTCCAGGG GGATTCGGGT TTGGAGTCGG AAGTAAGAGC CTGTCGAGAG
1951 CAACTGCGAG AGCGGATCCA AGAGTTTGAA ACTCAAGGGC TGGACTTGGT
2001 GGAAAAAGAG TTGCTTTGTG TGAGTAGTAG ATTAAGAAAT ACAGAGTGCG
2051 ATTGTGTATC TGGTGTTAAG AAAGAAGCAC CTCCTGGTAA GAAGTTTTAT
2101 GCCCAGTATT ATGATGAGAT TTATCGAGTT AGAGTTCAAT CCCGATGGAT
2151 GACGATGTCT GAGAGATTGA GAGAGGGAGT TCAAGCATGC AACAAGATGT
2201 TGAAGGCAGG CCTAAGCGAA GAAGATAAGG TTCTTAAAGA AGAAGAGTAT
2251 TGGTTGTATC GAGAGGAGAG AAAGAATAAA GAGAAACGTT TGGTTGGTAC
2301 TAAGATAGTA GCAACGCAGC AGCGAGTTGC AGCATTTGAA TCCATAGAAG
2351 TTCCTGAGAT TCCTGAGGCC CCAGAGGAGA AACCGAGTTT GCTGGATAAA
2401 GCGCGTTCTT TATTTACTCG CGAGGACCAT ACCTAG

The PSORT algorithm predicts inner membrane (0.461). 

The proteins were expressed in E.coli and purified as his-tag products (Figure 174: 7200=lanes 2-3;
7235=lanes 4-5; 7268=lanes 6-8; 7375=lanes 9-10; 7388=lanes 11-12). The recombinant proteins
were used to immunise mice, whose sera were used in Western blots (Figures 174, 175, 176, 177 &
178) and for FACS analysis.

These experiments show that cp7200, cp7235, cp7268, cp7375 & cp7388 are surface-exposed and
immunoaccessible proteins and that they are useful immunogens. These properties are not evident
from the sequence alone.

Example 179 

The following C.pneumoniae protein (pid 4376723) was expressed <SEQ ID 357; cp6723>: 
1 MATSVAPSPV PESSPLSHAT EVLNLPNAYI TQPHPIPAAP WETFRSKLST

51 KHTLCFALTL LLTLGGTISA GYAGYTGNWI ICGIGLGIIV LTLILALLLA

101 IPLKNKQTGT KLIDEISQDI SSIGSGFVQR YGLMFSTIKS VHLPELTTQN

151 QEKTRILNEI EAKKESIQNL ELKITECQNK LAQKQPKRKS SQKSFMRSIK

201 HLSKNPVILF DC*

The cp6723 nucleotide sequence <SEQ ID 358> is: 

1 ATGGCAACTT CCGTAGCCCC ATCACCAGTC CCCGAGAGCA GCCCTCTCTC 

51 TCATGCTACA GAAGTTCTCA ATCTTCCTAA TGCTTATATT ACGCAGCCTC 

101 ATCCGATTCC AGCGGCTCCT TGGGAGACCT TTCGCTCCAA ACTTTCCACA 

151 AAGCATACGC TCTGTTTTGC CTTAACACTA CTGTTAACCT TAGGGGGAAC

201 GATCTCAGCA GGTTACGCAG GATATACTGG AAACTGGATC ATCTGTGGCA 

251 TCGGCTTGGG AATTATCGTA CTCACACTGA TTCTTGCTCT TCTTCTAGCA 

301 ATCCCTCTTA AAAATAAGCA GACAGGAACA AAACTGATTG ATGAGATATC 

351 TCAAGACATT TCCTCTATAG GATCAGGATT TGTTCAGAGA TACGGGTTGA 

401 TGTTCTCTAC AATTAAAAGC GTGCATCTTC CAGAGCTGAC AACACAAAAT

451 CAAGAAAAAA CAAGAATTTT AAATGAAATT GAAGCGAAAA AGGAATCGAT 

501 CCAAAATCTT GAGCTTAAAA TTACTGAGTG CCAAAACAAG TTAGCACAGA 

551 AACAGCCGAA ACGGAAATCA TCTCAGAAAT CATTTATGCG TAGTATTAAG 

601 CACCTCTCCA AGAACCCTGT AATTTTGTTC GATTGCTGA

The PSORT algorithm predicts inner membrane (0.6095).

The protein was expressed in E.coli and purified as a his-tag product (Figure 179A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
179B) and for FACS analysis. 

These experiments show that cp6723 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone.

Example 180 

The following C.pneumoniae protein (pid 4376749) was expressed <SEQ ID 359; cp6749>: 

1 MSYYFSLWYL KVQQHFQAAF DFTRSLCSRI SMFALGVIAL LPIIGQLYVG
51 LDWLLSRIKK PEFPSDVDQI VRVEHVVGHD HRSRVEDILK RQRLSLEPRD
101 EGKVHGDLPS APFF*

The cp6749 nucleotide sequence <SEQ ID 360> is: 

1 ATGAGTTATT ACTTTTCTCT TTGGTATCTG AAGGTGCAAC AGCACTTTCA 

51 AGCAGCATTT GATTTTACTC GCTCCCTGTG TTCACGAATT TCTAATTTTG 

101 CTTTGGGAGT GATTGCATTG CTTCCTATTA TTGGGCAGTT GTATGTAGGG 

151 CTGGACTGGC TCCTCTCTAG GATAAAAAAG CCAGAATTTC CTTCCGATGT

201 GGATCAGATC GTGCGAGTAG AACACGTCGT GGGTCACGAC CATAGAAGTC 

251 GAGTTGAAGA TATTCTAAAG AGACAAAGGC TCTCATTAGA GCCTAGAGAC 

301 GAGGGGAAGG TTCACGGAGA TCTGCCTTCA GCTCCTTTTT TTTGA 

The PSORT algorithm predicts inner membrane (0.2996).

The protein was expressed in E.coli and purified as a his-tag product (Figure 180A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
180B) and for FACS analysis. 

These experiments show that cp6749 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 181, Example 182, Example 183, Example 184 and Example 185

The following C.pneumoniae protein (pid 4376301) was expressed <SEQ ID 361; cp6301>: 

1 LNQDLQNVYQ ECQKATGLES EVSAYRDHLR EQITEFETQG LDVIKEELLF

51 VSSTLKSKLS YDPLIADIPC MKFYEEYYDG IDKARVQSRW LEKSERYRKA 

101 KKGFQEMLKE GLFKEDQALK KAEYRLLREK RMNKEKLLIC NKIEAAQQRV

151 QEFGPSDS* 

The cp6301 nucleotide sequence <SEQ ID 362> is: 

1 TTGAATCAGG ATTTACAAAA TGTATACCAA GAGTGCCAGA AGGCTACAGG 

51 TTTAGAATCG GAAGTGAGTG CATATAGAGA TCATCTTAGA GAGCAGATCA 

101 CAGAGTTTGA AACTCAAGGG CTGGACGTGA TAAAAGAAGA ACTTCTTTTT 

151 GTGAGTAGTA CTCTCAAAAG TAAATTGAGC TATGATCCAT TAATAGCAGA 

201 CATTCCCTGT ATGAAGTTTT ATGAGGAGTA TTATGATGGC ATTGATAAAG

251 CGAGAGTTCA ATCCCGATGG CTGGAGAAGT CTGAGAGGTA TAGAAAGGCG 

301 AAGAAGGGAT TCCAAGAGAT GCTGAAGGAA GGCCTATTCA AAGAAGATCA 

351 GGCTTTGAAA AAAGCAGAGT A^AGATTACT TCGAGAGAAG AGAATGAATA 

401 AGGAGAAGCT TTTGATTTGC AATAAGATAG AAGCAGCTCA GCAGCGAGTC 

451 CAAGAATTTG GACCCTCGGA TTCATAA 

The PSORT algorithm predicts cytoplasm (0.4621).

The following C.pneumoniae protein (pid 4376558) was also expressed <SEQ ID 363; cp6558>:

1 MNIPAPQVPV IDEPVVNNTS SYGLSLKSSL RPITYLILAI LAIATLMSVL

51 YFCGIISVGT FVLGMLIPLS VCSVLCVAYL FYQQSSIEKT KVFSITSPSV

101 FFSDEDLNLL LGREEDSVSA IDELLKNFPA DDFRRPKMLP YSNFLDEQGR

151 PNESREEDSH TSKIL* 

The cp6558 nucleotide sequence <SEQ ID 364> is: 

1 ATGAACATAC CCGCTCCCCA AGTACCAGTC ATAGATGAGC CTGTAGTGAA 

51 CAACACAAGT AGCTATGGTC TTTCATTGAA AAGTAGTTTA AGACCGATTA 

101 CTTATTTGAT TTTAGCTATC TTAGCTATAG CCACACTGAT GTCTGTTCTC 

151 TACTTTTGTG GCATCATTAG TGTTGGGACG TTTGTTTTGG GCATGCTGAT 

201 CCCTCTATCG GTCTGCTCTG TTCTTTGCGT TGCCTATTTA TTCTATCAGC
251 AATCTTCTAT AGAAAAGACT AAGGTCTTTT CTATAACCAG TCCTTCAGTA

301 TTTTTCTCTG ATGAGGATCT TAATTTACTC TTAGGTCGAG AAGAAGATTC
351 AGTGTCTGCA ATTGATGAAC TTCTTAAGAA CTTTCCAGCT GATGATTTCC 
401 GTAGGCCGAA GATGCTTCCT TATTCAAATT TTCTAGATGA GCAGGGAAGG 
451 CCTAATGAGA GTAGGGAAGA AGACTCTCAT ACTTCCAAGA TCTTATAA 

The PSORT algorithm predicts inner membrane (0.4630). 

The following C.pneumoniae protein (pid 4376630) was also expressed <SEQ ID 365; cp6630>:

1 MSMTIVPHAL FKNHCECHST FPLSSRTIVR IAIASLFCIG ALAALGCLAP 

51 FVSYIVGSVL AFIAFVILSL VILALIFGEK KLPPTPRIIP DRFTHVIDEA 

101 YGLSISAFVR EQQVTLAEFR QFSTALLCNI SPEEKIKQLP SELRSKVESF 

151 GISRLAGDLE KNNWPIFEDL LSQTCPLYWL QKFISAGDPQ VCRDLGVPRE 

201 CYGYYWLGPL GYSTAKATIF CKETHHILQQ LTKEDVLLLK NKALQEKWDT

251 DEVKAIVERI YTTYTARGTL KTEAGGLTKE TISKELLLLS LHGYSFDQLQ 

301 LITQLPRDAW DWLCFVDNST AYNLQLCALV GALSSQNLLP ESSIDFDVNL

351 GLYVIQDLKE AVQAFSASDE PKKELGKFLL RHLSSVSKRL ESVLRQGLHR

401 IALEHGNARA RVYDVWFVTG ARIHRKTSIF FKD* 

The cp6630 nucleotide sequence <SEQ ID 366> is: 

1 ATGAGCATGA CGATCGTTCC ACATGCTTTA TTTAAAAATC ATTGCGAGTG 

51 TCATTCTACC TTTCCTTTGA GTTCAAGGAC TATTGTAAGA ATAGCCATTG 

101 CCAGCCTCTT TTGTATAGGT GCATTAGCAG CTTTAGGCTG TTTGGCTCCT 

151 CCCGTTTCTT ATATTGTTGG GAGTGTTTTA GCTTTTATTG CCTTTGTCAT

201 TCTTTCTTTA GTAATTTTAG CTTTGATTTT TGGAGAGAAG AAGCTTCCAC
251 CAACACCAAG AATCATTCCT GATAGATTTA CTCACGTGAT AGATGAAGCT
301 TATGGCCTTT CAATCTCTGC ATTTGTAAGA GAACAGCAGG TAACATTAGC
351 CGAGTTTAGA CAATTTTCTA CTGCCCTGTT GTGTAACATA TCTCCTGAAG
401 AGAAAATCAA ACAATTGCCT TCTGAATTGC GAAGTAAAGT AGAGAGTTTT
451 GGTATTAGCA GGCTCGCAGG TGATTTAGAA AAGAATAATT GGCCAATATT
501 TGAAGATCTT TTAAGCCAAA CCTGCCCGTT ATATTGGCTT CAGAAATTTA
551 TATCAGCAGG AGATCCACAA GTTTGTAGAG ACCTAGGTGT CCCTAGAGAA
601 TGTTATGGGT ACTATTGGCT AGGGCCTTTG GGATACAGTA CAGCTAAGGC
651 TACAATTTTT TGTAAAGAGA CGCATCATAT TCTTCAACAA TTAACGAAAG
701 AGGACGTTCT TTTATTAAAA AACAAGGCTC TTCAAGAGAA ATGGGATACT
751 GATGAAGTCA AAGCAATTGT AGAGCGTATC TACACTACCT ATACGGCACG
801 AGGAACTCTA AAGACCGAAG CAGGGGGACT TACAAAAGAG ACAATCAGTA
851 AGGAATTGCT ATTGTTGAGC TTGCATGGCT ATTCTTTTGA TCAGCTACAG
901 CTGATCACTC AACTTCCTAG AGATGCTTGG GATTGGCTGT GTTTTGTAGA
951 TAACAGTACC GCATACAACC TTCAGCTTTG TGCTCTTGTA GGAGCTTTGT
1001 CATCCCAAAA TCTTCTTGAC GAATCTTCTA TCGATTTTGA TGTAAACCTA
1051 GGCCTGTATG TGATTCAGGA TCTAAAAGAA GCTGTTCAAG CATTTTCTGC
1101 TTCTGATGAG CCAAAGAAAG AACTAGGTAA ATTCTTGTTA AGGCATTTGA
1151 GTTCAGTTTC TAAGCGATTA GAGAGTGTAT TAAGACAGGG TCTTCACAGA
1201 ATAGCTCTAG AGCATGGAAA TGCCAGAGCT AGGGTTTATG ACGTCAATTT
1251 TGTAACAGGA GCTAGAATTC ATAGGAAGAC GAGTATCTTC TTTAAAGACT
1301 AA

The PSORT algorithm predicts inner membrane (0.7092). 

The following C.pneumoniae protein (pid 4376633) was also expressed <SEQ ID 367; cp6633>: 

1 MVNIQPVYRN TQVNYSQATQ FSVCQPALSL IIVSVVAAVL AIVALVCSQS

51 LLSIELGTAL VLVSLILFAS AMFMIYKMRQ EPKELLIPKK IMELIQEHYP

101 SIVVDFIRDQ EVSIYEIHHL ISILNKTNVF DKAPVYLQEK LLQFGIEKFK

151 DVHPSKLPNF EEILLQHCPL HWLGRLVYPM VSDVTPGTYG YYWCGPLGLY

201 ENAPSLFERR SLLLLKKISF GEFALLEDGL KKNTWSSSEL VQIRQNLFTR

251 YYADKEEVDE AELNADYEQF DSLLHLIFSH KLS*

The cp6633 nucleotide sequence <SEQ ID 368> is: 

1 ATGGTTAATA TACAGCCTGT GTATAGGAAT ACCCAAGTCA ACTATAGTCA 

51 GGCTACCCAA TTTTCGGTGT GCCAGCCAGC GCTTAGCCTG ATTATCGTTT 

101 CTGTTGTTGC TGCTGTACTC GCTATTGTAG CTTTGGTATG CAGTCAATCT 

151 CTTTTATCCA TAGAGTTAGG AACTGCTCTT GTTCTAGTTT CTCTTATTCT 

201 TTTTGCTTCT GCTATGTTTA TGATTTATAA GATGAGACAA GAACCTAAGG 

251 AGTTGCTGAT CCCTAAGAAA ATCATGGAAC TCATCCAAGA ACATTATCCA

301 AGTATTGTTG TTGATTTTAT TAGAGATCAG GAGGTTTCCA TTTATGAGAT 

351 ACATCACTTG ATCTCTATTC TTAATAAGAC GAATGTTTTC GACAAAGCAC 

401 CAGTATATTT ACAAGAAAAA CTCTTACAGT TTGGCATTGA GAAGTTCAAA 

451 GATGTACATC CAAGTAAGCT CCCTAATTTT GAAGAAATTC TTCTACAGCA 

501 TTGCCCATTG CATTGGTTGG GACGTCTGGT ATATCCCATG GTATCGGATG 

551 TCACTCCAGG AACCTATGGA TACTATTGGT GTGGTCCTTT AGGACTGTAC 

601 GAGAACGCTC CCTCTCTTTT TGAACGTCGA TCTCTTCTAT TGTTAAAGAA 

651 AATTAGCTTT GGAGAGTTTG CTCTTTTAGA AGATGGTCTC AAGAAAAACA 

701 CGTGGAGTTC TTCGGAACTC GTTCAAATCA GACAAAACCT TTTTACAAGA 

751 TATTATGCTG ATAAAGAAGA GGTAGATGAA GCAGAGTTAA ACGCTGATTA 

801 CGAACAGTTT GATTCCCTCC TTCACCTTAT TTTTTCTCAC AAGCTCTCTT 

851 GA 

The PSORT algorithm predicts inner membrane (0.7283). 

The following C.pneumoniae protein (pid 4376642) was also expressed <SEQ ID 369; cp6642>:

1 MATISPISLT VDHPLVDTKK KSCSNFDKIQ SRILLITAIF AVLVTIGTLL

51 IGLLLNIPVI YFLTGISFIA VVLSNFILYK RATTLLKPRA CGKHKEIKPK

101 RVSTNLQYSS ISIAINRSKE NWEHQPKDLQ NLPAPSALLT DNPYEIWKAK

151 HSLFSLVSLL PGGNPEHLLI SASENLGKTL LIEETSQNAP ISSYVDTTPS

201 PKSLLNEAIQ ETRVEINTEL PAGDSGERLY WQPDFRGRVF LPQIPTTPEA

251 IYQYYYALYV TYIQTAINTN TQIIQIPLYS LREHLYSREL PPQSRMQQSL

301 AMITAVKYMA ELHPEYPLTI ACVERSLAQL PQESIEDLS*

The cp6642 nucleotide sequence <SEQ ID 370> is: 

1 ATGGCTACAA TCTCACCCAT ATCTTTAACT GTAGATCATC CCCTAGTAGA

51 CACTAAAAAA AAATCCTGCA GCAACTTTGA TAAGATTCAG TCTCGAATTC

101 TATTGATTAC TGCAATCTTT GCTGTCTTAG TTACTATAGG GACCCTACTT 

151 ATTGGTTTGC TTTTAAATAT TCCTGTTATC TATTTCCTCA CAGGAATTTC 

201 ATTTATTGCT GTTGTTCTTA GCAACTTTAT CCTTTATAAA CGAGCAACCA 

251 CCCTCTTAAA ACCGCGTGCT TGTGGCAAAC ACAAAGAAAT AAAACCAAAA 

301 AGGGTCTCCA CCAACCTACA GTATTCTTCT ATCTCTATCG CAATCAATCG 

351 TTCTAAAGAA AACTGGGAAC ACCAACCCAA GGACCTACAG AATCTCCCCG 

401 CACCCTCTGC ATTACTCACA GATAACCCTT ACGAGATATG GAAAGCTAAA

451 CATTCACTGT TTTCCCTAGT ATCCCTCCTA CCGGGAGGCA ATCCAGAACA 

501 TCTCTTAATT TCAGCTTCCG AAAATTTAGG AAAGACTCTG TTAATTGAAG 

551 AAACCTCGCA AAATGCGCCT ATATCCTCCT ACGTAGATAC CACTCCCTCC 

601 CCAAAATCCT TGCTCAATGA GGCAATTCAG GAAACCAGGG TAGAAATAAA 

651 TACAGAACTC CCTGCGGGAG ATTCAGGAGA ACGTTTATAC TGGCAACCCG 

701 ATTTCCGAGG CCGCGTCTTC CTCCCACAAA TACCAACAAC TCCTGAAGCC

751 ATCTACCAAT ACTACTATGC ACTCTATGTC ACTTATATCC AGACTGCGAT

801 CAATACGAAC ACCCAAATTA TCCAAATCCC TTTATACAGC TTGAGGGAGC

851 ATCTCTATTC TAGAGAATTG CCCCCGCAAT CAAGAATGCA ACAATCTTTG

901 GCTATGATTA CAGCAGTAAA ATACATGGCC GAGCTGCACC CAGAATATCC

951 GCTAACTATT GCTTGTGTTG AAAGATCCTT AGCCCAACTA CCTCAAGAAA 

1001 GTATTGAGGA TCTCTCTTAG 

The PSORT algorithm predicts inner membrane (0.5288). 

The proteins were expressed in E.coli and purified as GST-fusion products. The recombinant 
proteins were used to immunise mice, whose sera were used in Western blots (Figures 181-185) and 
for FACS analysis. 

These experiments show that cp6301, cp6558, cp6630, cp6633 and cp6642 are surface-exposed and 
immunoaccessible proteins, and that they are useful immunogens. These properties are not evident 
from their sequences alone. 



Example 186 

The following C.pneumoniae protein (PID 4376389) was expressed <SEQ ID 371; cp6389>: 

1 MSEVKPLFLK NDSFDLATQR FQNLINMLQE QAEIYNEYEE KNARVQNEIK 

51 EQKDFVKRCI EDFEARGLGV LKEELASLTR DFHDKAKAET SMLIECPCIG

101 FYYSIHQEEQ RQRQERLQKM AERYRDCKQV LEAVQVEQKD MISSRVVVDD

151 SYFEEEKEEQ KVDNRKKEQD * 

The cp6389 nucleotide sequence <SEQ ID 372> is: 

1 ATGTCAGAAG TGAAGCCTTT GTTTTTAAAG AATGACTCTT TTGATTTGGC 

51 AACTCAGAGA TTCCAGAATC TAATTAACAT GCTACAAGAG CAAGCCGAGA 

101 TATATAACGA GTATGAAGAA AAGAATGCTA GGGTTCAGAA TGAGATTAAG 

151 GAGCAAAAGG ACTTTGTGAA AAGATGCATA GAGGACTTTG AAGCCAGAGG

201 ACTGGGGGTG CTAAAAGAAG AGCTTGCATC TTTGACGCGT GATTTCCATG 

251 ATAAAGCAAA AGCAGAGACT TCTATGCTCA TTGAATGTCC TTGTATTGGT 

301 TTTTATTATA GTATTCATCA GGAGGAACAA AGGCAAAGGC AAGAAAGGCT 

351 TCAAAAGATG GCTGAGCGCT ATAGGGACTG TAAACAAGTC TTGGAGGCTG 

401 TCCAGGTGGA GCAAAAAGAT ATGATATCTT CTAGAGTCGT TGTCGATGAC 

451 AGCTACTTTG AAGAAGAAAA AGAAGAACAA AAGGTGGATA ACAGAAAGAA 

501 AGAACAGGAC TAG 

The PSORT algorithm predicts cytoplasm (0.3193). 
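The listings in these Examples share one layout: each row starts with the 1-based position of its first residue or base, followed by groups of ten characters. The following is a minimal sketch of a reader for that layout; the function name `parse_listing` and its error handling are assumptions made for illustration and are not part of the original disclosure. It joins the rows into one contiguous string and uses the position numbers as a check on missing or duplicated rows.

```python
import re


def parse_listing(text: str) -> str:
    """Join numbered listing rows into one contiguous sequence string.

    Each non-blank row must look like '<position> <group> <group> ...'.
    The position is checked against the number of characters read so far,
    so a dropped or repeated row raises ValueError instead of passing silently.
    """
    parts = []
    length = 0
    for line in text.splitlines():
        match = re.match(r"\s*(\d+)\s+(\S.*)$", line)
        if match is None:
            continue  # blank line or a row without a leading position
        position = int(match.group(1))
        chunk = re.sub(r"\s+", "", match.group(2))
        if position != length + 1:
            raise ValueError(
                f"row starts at {position}, expected {length + 1}"
            )
        parts.append(chunk)
        length += len(chunk)
    return "".join(parts)


# First two rows of the cp6389 nucleotide listing (SEQ ID 372).
rows = """
1 ATGTCAGAAG TGAAGCCTTT GTTTTTAAAG AATGACTCTT TTGATTTGGC

51 AACTCAGAGA TTCCAGAATC TAATTAACAT GCTACAAGAG CAAGCCGAGA
"""
sequence = parse_listing(rows)
print(len(sequence), sequence[:10])
```

Checking the position numbers is a deliberate design choice: it catches rows that were lost or reordered, at the cost of rejecting listings whose rows are deliberately shorter than fifty characters.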

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 186A) and also in
his-tagged form. The recombinant proteins were used to immunise mice, whose sera were used in a 
Western blot (Figure 186B) and for FACS analysis. 

These experiments show that cp6389 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 187 

The following C.pneumoniae protein (PID 4376792) was expressed <SEQ ID 373; cp6792>: 

1 VLQEHFFLSE DVITLAQQLL GHKLITTHEG LITSGYIVET EAYRGPDDKA

51 CHAYNYRKTQ RNRAMYLKGG SAYLYRCYGM HHLLNVVTGP EDIPHAVLIR

101 AILPDQGKEL MIQRRQWRDK PPHLLTNGPG KVCQALGISL EKGKFRQRLNTP 

151 ALYISKEKIS GTLTATARIG IDYAQEYRDV PWRFLLSPED SGKVLS* 

The cp6792 nucleotide sequence <SEQ ID 374> is: 

1 GTGCTACAAG AACATTTTTT TCTATCGGAA GATGTAATTA CACTAGCGCA 

51 ACAGCTTTTA GGACATAAAC TCATCACAAC ACATGAGGGT CTGATAACTT 

101 CAGGTTACAT TGTAGAAACC GAAGCGTATC GTGGCCCTGA TGACAAAGCA 

151 TGCCACGCCT ACAACTACAG AAAAACTCAG AGGAACAGAG CGATGTACCT 

201 GAAAGGAGGC TCTGCTTACC TCTACCGTTG CTATGGCATG CATCACCTAT

251 TGAATGTTGT CACTGGACCT GAGGACATTC CCCATGCCGT CCTGATCCGG

301 GCCATCCTTC CTGATCAAGG CAAAGAACTT ATGATCCAAC GCCGCCAATG

351 GAGAGATAAA CCCCCACACC TTCTCACCAA TGGACCCGGA AAAGTGTGCC 

401 AAGCTCTAGG AATCTCTTTG GAAAACAATA GGCAACGCCT AAATACCCCA 

451 GCTCTCTATA TCAGCAAAGA AAAAATCTCT GGGACTCTAA CAGCAACTGC

501 CCGGATCGGC ATCGATTATG CTCAAGAGTA TCGTGATGTC CCATGGAGAT

551 TTCTCCTATC CCCAGAAGAT TCGGGAAAAG TTTTATCTTA A 

The PSORT algorithm predicts cytoplasm (0.180).

The protein was expressed in E.coli and purified as a his-tagged product (Figure 187A; lanes 2-4).
The recombinant protein was used to immunise mice, whose sera were used in a Western blot
(Figure 187B) and for FACS analysis.

These experiments show that cp6792 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 188 

The following C.pneumoniae protein (PID 4376868) was expressed <SEQ ID 375; cp6868>: 

1 MVETVLHNFQ RYLSKYLYRV FRFPCRKKTF LSSHRVLARP SFPVDYCPGK

51 IYDLQEIYEE LNAQLFQGAL RLQIGWFGRK ATRKGKSVVL GLFHENEQLI

101 RIHRSLDRQE IPRFFMEYLV YHEMVHSVVP REYSLSGRSI FHGKKFKEYE

151 QRFPLYDRAV AWEKANAYLL RGYKKRVGGG YGRA* 

The cp6868 nucleotide sequence <SEQ ID 376> is:

1 ATGGTTGAAA CAGTACTTCA TAATTTCCAA CGTTATCTGA GCAAGTATCT

51 CTATAGGGTA TTTCGCTTCC CATGTCGTAA AAAGACGTTC CTATCTTCGC 

101 ACAGGGTTCT TGCTCGTCCT TCATTCCCAG TAGACTACTG TCCGGGAAAG 

151 ATCTATGATT TGCAGGAGAT CTATGAGGAA TTGAATGCGC AGTTATTTCA 

201 AGGTGCACTG CGTTTACAGA TTGGTTGGTT CGGAAGGAAA GCTACCAGAA 

251 AAGGCAAGAG TGTTGTCTTG GGATTGTTTC ATGAAAATGA ACAGTTAATT

301 CGAATTCATC GTTCTTTAGA TCGGCAGGAA ATCCCAAGAT TTTTTATGGA

351 ATATCTTGTG TATCATGAAA TGGTTCATAG TGTAGTCCCT AGAGAGTATT

401 CTCTATCGGG GCGTTCGATT TTTCATGGTA AAAAGTTTAA AGAATACGAA 

451 CAACGTTTCC CCTTGTATGA TCGTGCTGTT GCTTGGGAAA AGGCAAACGC 

501 TTATTTATTG CGAGGGTATA AAAAAAGAGT AGGTGGAGGA TATGGCAGGG

551 CATAG 



The PSORT algorithm predicts bacterial cytoplasm (0.325). 

The protein was expressed in E.coli and purified as a his-tag product (Figure 188A; lanes 2-3). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
188B) and for FACS analysis.

These experiments show that cp6868 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 189 

The following C.pneumoniae protein (PID 4376894) was expressed <SEQ ID 377; cp6894>:

1 MYKRCVLDKI
51 SRVVKQQQTS
101 QQTLPELLGT
151 SPHVGKYEEF
201 PKHVQLDEVF
251 VSVENDLKLV
301 FANGEKIIED
351 IVFSRNPDFY
401 DNFYSFMKSS
451 CAMNMAIDRE
501 RLLEEEGWID
551 ACKEIGIECS
601 EGAMEKGSAN
651 PYAFLFSRHC
701 DPCLSTS*



The cp6894 nucleotide sequence <SEQ ID 378> is: 

1 ATGTATAAAA GATGTGTGCT AGATAAAATT TTAAAGGGGA TTGTCGCCGG

51 TTCTTTAATT TTGTTATACT GGTCCTCAGA CCTACTTGAA AGAGACATTA

101 AGTCGATAAA AGGTAACGTA AGAGATATTC AAGAAGACAT TCGTGAAATC

151 TCACGCGTAG TGAAACAACA GCAGACATCA CAAGCTATCC CTGCGGCACC

201 TGGGGTGATG CTCGCTCCTA AGCTCGTCAG AGACGAAGCT TTTGCTCTAC

251 TCTTTGGAGA TCCTAGTTAT CCTAATTTAC TTTCCCTAGA CCCCTATAAA

301 CAGCAGACTC TTCCTGAACT TCTAGGAACA AATTTCCACC CTCATGGTAT

351 CCTACGCACT GCCCATGTCG GAAAACCCGA AAATCTGAGC CCTTTTAATG 

401 GCTTTGATTA TGTCGTGGGC TTTTACGATC TCTGTATTCC TAGTTTAGCT 

451 TCTCCCCACG TAGGGAAATA CGAAGAATTT TCTCCAGATC TCGCTGTGAA 

501 AATAGAAGAA CATCTTGTTG AAGATGGTTC TGGGGATAAA GAGTTTCACA 

551 TCTATCTGAG GCCGAATGTT TTTTGGCGTC CTATAGATCC TAAGGCCCTT

601 CCAAAACACG TTCAGTTAGA CGAAGTATTT CAACGTCCTC ATCCTGTGAC 

651 AGCTCATGAT ATTAAGTTTT TCTACGACGC TGTTATGAAC CCTTATGTAG 

701 CAACCATGCG AGCAGTGGCT CTGCGCTCTT GTTATGAAGA TGTGGTTTCT 

751 GTCTCAGTAG AAAACGATTT AAAATTAGTA GTCAGATGGA AAGCACACAC 

801 GGTAATCAAT GAAGAAGGAA AGGAAGAGCG CAAAGTGCTC TACTCTGCAT

851 TTTCTAATAC CTTAAGCTTG CAGCCCCTCC CTAGATTTGT ATATCAGTAT 

901 TTTGCTAACG GGGAAAAAAT CATTGAAGAT GAGAATATCG ATACCTACCG

951 AACCAATTCC ATTTGGGCGC AAAACTTCAC TATGCATTGG GCAAACAACT 

1001 ATATTGTAAG TTGTGGAGCC TACTACTTTG CAGGGATGGA TGATGAGAAA 

1051 ATCGTGTTTT CTAGAAATCC TGACTTCTAT GATCCTCTTG CGGCTCTTAT

1101 TGACAAGCGT TTCGTCTATT TTAAGGAAAG CACAGACTCC CTATTCCAAG 

1151 ATTTTAAGAC AGGGAAAATA GACATCTCTT ACCTTCCACC CAACCAAAGA 

1201 GATAATTTCT ATAGTTTTAT GAAAAGCTCC GCTTATAACA AACAGGTAGC

1251 TAAGGGAGGA GCCGTCCGTG AAACAGTCTC AGCAGATCGA GCATATACGT

1301 ACATAGGATG GAATTGCTTT TCATTATTTT TCCAAAGCCG ACAGGTGCGC

1351 TGTGCTATGA ACATGGCAAT CGATAGAGAG AGGATTATCG AACAGTGCTT

1401 GGATGGCCAA GGCTATACGA TTAGTGGGCC TTTTGCTTCG AGTTCTCCTT 

1451 CTTATAATAA ACAGATCGAA GGGTGGCATT ATTCTCCAGA AGAAGCAGCT 

1501 CGTCTCCTGG AAGAAGAGGG ATGGATAGAT ACCGATGGCG ATGGAATCCG 

1551 AGAAAAAGTT ATCGATGGTG TGATTGTCCC GOTCCGTTTC CGTTTATGCT

1601 ATTATGTAAA GAGTGTCACC GCTCATACCA TTGCAGATTA CGTAGCTACT 

1651 GCTTGTAAGG AAATCGGAAT CGAGTGTAGC CTTCTAGGAC TAGATATGGC 

1701 CGATCTTTCG CAAGCTTTTG ATGAAAAGAA TTTCGATGCT CTTTTAATGG 

1751 GATGGTGTTT AGGAATTCCT CCTGAGGATC CTAGGGCTTT ATGGCATTCT 

1801 GAAGGGGCTA TGGAAAAGGG TTCAGCGAAT GTTGTAGGTT TCCATAATGA

1851 AGAAGCTGAT AAAATCATAG ACAGACTCAG CTACGAATAC GATCTGAAAG 

1901 AACGTAATCG CCTGTACCAC CGTTTCCATG AAATTATTCA TGAGGAAGCT 

1951 CCTTATGCTT TCTTGTTCTC ACGACATTGT TCCTTACTTT ATAAGGATTA 

2001 TGTAAAAAAT ATTTTCGTAC CTACACATAG AACAGATTTA ATTCCTGAAG 

2051 CTCAGGATGA GACTGTCAAC GTAACTATGG TATGGCTTGA GAAGAAGGAG 

2101 GATCCGTGCT TAAGTACATC CTAA 

The PSORT algorithm predicts inner membrane (0.162). 

The protein was expressed in E.coli and purified as a his-tag product (Figure 189A) and also in
GST/his form. The recombinant proteins were used to immunise mice, whose sera were used in a 
Western blot (Figure 189B) and for FACS analysis. 

These experiments show that cp6894 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 190 

The following C.pneumoniae protein (pid 4377193) was identified in the 2D-PAGE experiment
<SEQ ID 379; cp7193>:

1 MKRVIYKTIF CGLTLLTSLS SCSLDPKGYN LETKNSRDLN QESVILKENR

51 ETPSLVKRLS RRSRRLFARR DQTQKDTLQV QANFKTYAEK ISEQDERDLS

101 FVVSSAAEKS SISLALSQGE IKDALYRIRE VHPLALIEAL AENPALIEGM

151 KKMQGRDWIW NLFLTQLSEV FSQAWSQGVI SEEDIAAFAS TLGLDSGTVA

201 SIVQGERWPE LVDIVIT*

A predicted leader peptide is underlined. 

The cp7193 nucleotide sequence <SEQ ID 380> is: 

1 ATGAAAAGAG TCATTTATAA AACCATATTT TGCGGGTTAA CTTTACTTAC 
51 AAGTTTGAGT AGTTGTTCCC TGGATCCTAA AGGATATAAC CTAGAGACAA

101 AAAACTCGAG GGACTTAAAT CAAGAGTCTG TTATACTGAA GGAAAACCGT 
151 GAAACACCTT CTCTTGTTAA GAGACTCTCT CGTCGTTCTC GAAGACTCTT 
201 CGCTCGACGT GATCAAACTC AGAAGGATAC GCTGCAAGTG CAAGCTAACT 
251 TTAAGACCTA CGCAGAAAAG ATTTCAGAGC AGGACGAAAG AGACCTTTCT
301 TTCGTTGTCT CGTCTGCTGC AGAAAAGTCT TCAATTTCGT TAGCTTTGTC

351 TCAGGGTGAA ATTAAGGATG CTTTGTACCG TATCCGAGAA GTCCACCCTC 
401 TAGCTTTAAT AGAAGCTCTT GCTGAAAACC CTGCCTTGAT AGAAGGGATG 
451 AAAAAGATGC AAGGCCGTGA TTGGATTTGG AATCTTTTCT TAACACAATT 
501 AAGTGAAGTA TTTTCTCAAG CTTGGTCTCA AGGGGTTATC TCTGAAGAAG 
551 ATATCGCCGC ATTTGCCTCC ACCTTAGGTT TGGACTCCGG GACCGTTGCG

601 TCCATTGTCC AAGGGGAAAG GTGGCCCGAG CTTGTGGATA TAGTGATAAC 
651 TTAA 

The PSORT algorithm predicts periplasmic (0.925). 
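A quick internal check on any protein/nucleotide pair in these Examples is that the nucleotide length equals three times the number of residues plus one stop codon. The following is a minimal sketch of that arithmetic; the function name `orf_consistent` is a hypothetical helper, and the figures used for cp7193 (217 residues, 654 bases, final codon TAA) are counted from the two listings above rather than stated in the text.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_consistent(nt_length: int, aa_length: int, last_codon: str) -> bool:
    """Return True if the lengths describe one complete open reading frame.

    nt_length  -- number of bases in the nucleotide listing, stop codon included
    aa_length  -- number of residues in the protein listing, '*' excluded
    last_codon -- final three bases of the nucleotide listing
    """
    return (
        nt_length % 3 == 0
        and last_codon.upper() in STOP_CODONS
        and nt_length == 3 * (aa_length + 1)
    )


# cp7193: 217 residues (SEQ ID 379) against 654 bases (SEQ ID 380).
print(orf_consistent(654, 217, "TAA"))
```

A pair that fails this check is not necessarily wrong biologically, but in a printed listing it usually indicates a dropped or duplicated row.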

This shows that cp7193 is an immunoaccessible protein in the EB and that it is a useful immunogen. 
These properties are not evident from the protein's sequence alone.

It will be appreciated that the invention has been described by way of example only and that 
modifications may be made whilst remaining within the spirit and scope of the invention. 
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TABLE II - sequences of the primers used to amplify Cpn genes. 



Orf ID 


N-terminus final primer 


C-terminus final primer 


CP0014P 


GCGTC ccg ggtcatatg aagtcttctttcccca 




CP0O15P 


GCGTCCCGGGTCATATG TCAGCTCTGTTTTCTGA 


rtPnT P(PP />7l O P>HWfiimni..,i 


CP0016P 


GCGTCCCGGGTCATATG GCCGATCTCACATTAG 


GCGT CTC GAG GTCCAAGTTAAGGTAGCA 


CP0017P 


GCGT CCG GGTCATATG GGTATCAAGGGAACTG 


l9LL>l L1L GAG AAA i a_ <JG AATCT TCC 


CP0019P 


GCGTCCCGGGTCAT ATGCAAGACTCTCAAGACTATAG 


GCGT CTC GAG AAATCGGTATTTACCC 


CP6260P 


GCGTC CCG GGT GCTAGCACTACGATTTCTTTAACCC 


GCGT CTC GAG AAAACGaaATTTGCTTC 


CP6397P 




GCGT CTC GAG ATGAAAGaAGAGTCCTCG 


CP6456P 


GCGTC CCG GGT CATATG TCATCTCCTGTAAATAACA 


GCGT CTC GAG CTGACCATCTCCTGTT 


CP6466P 


^JJ.^- V3UI X t\ ±\J 1 'wi-V/Awj V/ri\J7 X H^ALj X 


GCGT CTC GAG ATTTTCCTTAGCATAACG 


CP6467P 


GCGTC CCCi fZfZT 1 pit ?i cpfi Tfin«rr , r , r'r , R'T'r , r , r ,, Mt 
VIV -" iv -* ^-wv» viva J. w** W« lu 1 Jl Lv.v.Ui 1 ULLnn 


GCGT CTC GAG TAGTTTTTCTATAAAACGAAAGTCT 


CP6468P 


v-v-vj vjo 1 U-rV J- rti-lj l^L 1 1 Av. XCTI C. 


GCGT CTC GAG GGGGAAATAGGTATATTTGA 


CP6469P 


GCGTC CCd CZCv CAT* ATfi ACr"PCr^pr , \ i a nr^-r, n 
vVvj Vava J. ( — t\ X n.1 Vj *4Jj>v l^vJiLAAAGLAA 


GCGT CTC GAG ACTTAAGATATCGATATTTTTGA 


CP6552P 


•jujx*- v-i^vj (jri^i l»AA Al« IvjLLAI AAGGAAGATG 


GCGT CTC GAG ACCATTGTCTTGAGTCAT 


CP6567P 


-i *- v,v.^s vj>j 1 LA-L AZu ALLrxUA,L\-GAi (JCJCC 


GCGT CTC GAG AGAAGCCGGTAGAGGC 


CP6576P 


v/^ox^. *-Lvj woi LAI Alts AL 1 GAAAAAG I TAAAG AAGG 


GCGT CTC GAG GAA CATGCCCCCTAA 


CP6727P 


LV-la vitji Ui XiiitjL lAL^ICGACU AATGGC 


GCGT CTC GAG GAAAGAATAACGAGTTCC 


CP6729P 


GCGTC CCG GGT CAT ATGGCAGATGCTTCTTTATC 


GCGT CTC GAG GAATGAGTATCTTAGCC 


CP6731P 


(iCCZ'VC CCC2 PPfp P& PV^PP'TV"utiitv- , "Ti'7V-« 71 > n iiv~i> n m 

V3V -- VJ ■*• *~ laljl LAAAlGGLi\/rTGTTGAAATCAAT 


GCGTC CAT GGC GGC CGC GAACTGGAACTTACCTCC 


CP6736P 


o^vs 1 u Lrfj 1 vjL X AGLbTAGAAG TTATCATGCCTT 


GCGTC CAT GGC GGC CGC AAATCGTAATTTGCTTC 


CP6737P 


w^-v» x >Jvj>i 1 LL LAi Alt vjAGACTAGACTCGGAGG 


GCGT CTC GAG AAATGTGGATTTTAGTCC 


CP6751P 


vav-ia i W Vala A tiv- -1 AGL. AA1 GAAGG 1 CTCCAACT 


GCGT CTC GAG AAATCTCATTCTACTCGC 


CP6752P 




GCGT CTC GAG GAATTTTAAGGTACTTCCTG 


CP6753P 


vi vsrVal Gv-l AGG AuTCCCT ACTCTC ATAGAG 


GCGT CTC GAG AAACTTAAAGGTCGTTC 


CP6767P 


w*-v> l v» Vv_.kj lava x L/i J. Aib Al AAAAGAAA x AQjGCCGT 


GCGT CTC GAG TTCGTAAGCAACTTCAGA 


CP6829P 


va 1 — >j x »— 13 l LAX Alts AAGGAvaAlAjLGIGTTT 


GCGTC CAT GGC GGC CGC GAAACTAAGGGAGAGGC 


CP6830P 


^»t-la Ulax LnX nlu UfAlU'— I.GGG TL X Gil 


GCGTC CAT GGC GGC CGC GAATACAAACCGGATCC 


CP6832P 


GCGTC CCCZ CZCZT* CllT 1TP PI ITK 2 SPUl « Hit r<mn»ninni tnmn 

uwuav. '"V-vj wvja v.rt x niu (.ni AAAVsx AAlAGTu TTCATTT 


GCGT CTC GAG TAAACTAGAAAAAGTCGTC 


CP6848P 


uvuiv vjux \*t\± rtlAj 1LA1 LAAATL 1ALAT UGC 


GCGT CTC GAG AACGCGAGCTATTTTAC 


CP6849P 


ftPRTT 1 CCCZ flPT fiPT* ipp apppppppntTtinTvnji^ 


GCGT CTC GAG ATACACGTGGGTATTTTC 


CP6850P 




GCGT CTC GAG CTGTTTGCATCTGCC 


CP6854P 


wwlv ' v -'"^-' uvi ■riv/v- 1 LftAinbU lAi IbL/Ulb 


GCGT CTC GAG TTATCGAAATGTCTTTG 


CP6879P 




GCGTC CAT GGC GGC CGC TCCTTGAAATTGCTCTTGC 


CP6894P 


GCGTC CCG GGT CAT AT(2 1 , AT*an&afian i Pfpp'Ttppni»pTi 


GCGT CTC GAG GGATGTACTTAAGCACG 


CP6900P 


GCGTC CCG GGT CAT ATG AAf5RTA3i3iLT if P*T ir PptT«rpn7ATip 


GCGT AAG CTT GGGAAGACGATACCG 


CP6952P 


GCGTC CCG GGT CAT ATfi pmprp/~.r:y-;ji "pr* a a fp/ifrtRrn ft/-! r< 


GCGT CTC GAG TCGAATTTCTTTTTTAGC 


CP7034P 


GCGTC CCG GGT CAT ATC! Aiiai APar2f3fp»tpjirnr , 7A 7atip 


GCGT AAG CTT AAACGCTGAAATTATACC 


CP7090P 


GCGTC CCG GGT CAT ATG TGTAGCCTTTCCCCT 


GCGT CTC GAG GCGTGCATGAATCTTA 


CP7091P 


GCGTC CCG GGT CAT ATG GAAGAATTAGAAGTTGTTGT 


GCGT CTC GAG TAGTGTTCTC^PTTATCGGT 


CP7170P 


GCGTC CCG GGT CAT ATG CTAGGGGCTGGAAACC 


GCGT AAG CTT AAACTGCAGACCTGACG 


CP7228P 


GCGTC CCG GGT CAT ATG ACTGCTGTTCTTATTCTTACA 


GCGT CTC GAG ATCTGAAAGCGGAGG 


CP7249P 


GCGTC CCG GGT CAT ATG ATCCCATCCCCTACC 




CP7250P 


GCGTC CCG GGT CAT ATG AATCTTTCAAACAGGTCT 


GCGT CTC GAG ATTTTTTCTAGAGAGACTCTC 


CP0018P 


GTGCGT CATATG GCAACCACTCCACTAA 


ACTCGCTA GCGGCCGC TAATGAGGTCCCCAG 


CP6270P 


GTGCGT CATATG AATTTATTAGGAGCTGCT 


ACTCGCTA GCGGCCGC AAATTTGATTTTGCTACC 


CP6735P 


GTGCGT CATATG GCAGCACAAGTTGTATAT 


ACTCGCTA GCGGCCGC TGGCGTAGAAGTGATC 


CP6998P 


GTGCGT CATATG TTGCCTGTAGGGAAC 


ACTCGCTA GCGGCCGC GAATCTGAACTGACCAGA 


CP7033P 


GTGCGT CATATG GTTAATCCTATTGGTCCA 


ACTCGCTA GCGGCCGC TTGGAGATAACCAGAATATA 


CP7287P 


GTGCGT CATATG TTACACAGCTCAGAACTAGA 


ACTCGCTA GCGGCCGC GAAAATAATACGGATACCA 


CP0010P 


GTGCGT CATATG GCAACTGCTGAAAATATA 


GCGT CTCGAG GAATTGGAACTTACCC 


CP0468P 


GTGCGT GCTAGC A.TTTTTTATGACAAACTCTAT 


GCGT CTCGAG AAATGTGCAATGACTCT 


CP6272P 


GTGCGT CATATG TTGACTCATCAAGAGGCT 


GCGT CTCGAG GAAGGGAGGTTTTTTAGGT 


CP6273P 


GTGCGT CATATG ACATATCTGGAAGCTC 


ACTCGCTA GCGGCCGC CTCCACAATTTTTATG 


CP6362P 


GTGCGT CATATG CCCTTTGATATTACTTATTATACA 


GCGT CTCGAG TCGTTTCCAAATCCA 


CP6372P 


GTGCGT CATATG AAACAACACTATTCTCTAAATA 


GCGT CTCGAG TTTCTTGTGGTTTTTCT 


CP6390P 


GTGCGT CATATG CGAGAGGTGCCTAAG 


ACTCGCTA GCGGCCGC TCTCCTAGACAGCCTT 


CP6402P 


GTGCGT CATATG AATGTTGCGGATCTCCTTT 


GCGT CTCGAG GAAGGGGTTGGCCGT


CP6446P 


GTGCGT CATATG TGTAATCAAAAGCCCTCTT 


GCGT CTCGAG GGGCTGAGGAGGAAC 


CP6520P 


GTGCGT GCTAGC AAACACTACCTATCATTTTCT 


GCGT CTCGAG CAGAAAGGCTTTTCTTT 


CP6577P 


GTGCGT CATATG AATTTAGGCTATGTTAATTTA 


GCGT CTCGAG GTTTTGTTTTTTGAAAGA 


CP6602P


GTGCGT CATATG GCAGCATCAGGAGGCA 


GCGT CTCGAG TGACCAAGGATAGGGTTTAG 






CP6607P 


GTGCGT CATATG CCTCGTGGTGACACTTT 


[primer sequence illegible in source]




CP6615P 


GTGCGT CATATG TGCTCTCAAAAAACGACAA 






CP6624P 


GTGCGT CATATG GATGCGAAAATGGGA 


GCGT CTCGAG TCTTTGACATTCAAGAGC 




CP6672P 


GTGCGT CATATG ATTCCTACCATGTTAATG 


GCGT CTCGAG GTCATACAATTTCCTTATATA 


CP6679P 


GTGCGT CATATG TGCACTCACTTAGGCT 


GCGT CTCGAG CGAGTAGTTAGCACAAAC 




CP6717P 


GTGCGT GCTAGC AAGACAATCGTAGCTTCA 


ACTCGCTA GCGGCCGC GGCTGGCATATAGGT 




CP6784P 


GTGCGT GCTAGC AAATCAAGATGTTCTATTGATA 


GCGT CTCGAG TCCAAAACAACCCTCT 




CP6802P 


GTGCGT CATATG TGCGTAAGTTATATTAATTCCTT 


GCGT CTCGAG CAGTCGGGCTTGTTG 




CP6847P 


GTGCGT CATATG TCGGATCTTTTACGAG 


GCGT CTCGAG TTTTCTACACTGTTGTAATAAA


CP6884P 


GTGCGT CATATG AATCAGCTGCTTTCT 


GCGT CTCGAG AGAGAAGGTAATTGTACC 




CP6886P 


[primer sequence illegible in source]


GCGT CTCGAG TTCAGAAAAATGGCT 




CP6890P 


GTGCGT CATATG [remainder illegible in source]


GCGT CTCGAG TCCTGCAGCATTTAGC 




CP6960P 


[primer sequence illegible in source]


ACTCGCTA GCGGCCGC TTCACCTTGATTTCCT 




CP6968P 




ACTCGCTA GCGGCCGC GGAAGTATGCTTAGATATT 


CP6969P 


[primer sequence illegible in source]


ACTCGCTA GCGGCCGC AAAAAGGTCATAGTATACCT 


CP7005P 


[primer sequence illegible in source]


GCGT CTCGAG CTGAGCTTCTATTTCTATTAT 




CP7072P 


GTGCGT CATATG [remainder illegible in source]


GCGT CTCGAG GTTGAGCAAAGGTTTG 




CP7101P 


GTGCGT CATATG [remainder illegible in source]


GCGT CTCGAG GAAAAATTCTTTAGGGAG 




CP7102P 


[primer sequence illegible in source]


GCGT CTCGAG TGAAAATGAAAGGATGGT 




CP7105P 




GCGT CTCGAG ATCTTTCATTTGGTTATCT 


CP7106P 


[primer sequence illegible in source]


GCGT CTCGAG GAATCCTAAGGCATACCTA 




CP7107P 


[primer sequence illegible in source]


GCGT CTCGAG GAAGCTAAGATTATAGCTACTTT 




CP7108P


[primer sequence illegible in source]


ACTCGCTA GCGGCCGC TTTATGTATATGGAACAGATAGG 




CP7109P 


GTGCGT CATATG GGACATTTTATTGATATTG 


ACTCGCTA GCGGCCGC ATCATCAAGGTAGATAAAG 




CP7110P 


[primer sequence illegible in source]


GCGT CTCGAG TTCTGATTGGACTCCA 




CP7127P 


[primer sequence illegible in source]


ACTCGCTA GCGGCCGC GCAGCCATCGTATTC


CP7130P 


[primer sequence illegible in source]


GCGT CTCGAG CTTCTTATTTGAACTTTG 


CP7140P 


[primer sequence illegible in source]


GCGT CTCGAG AGCACCCTCAATTTCATTG 




CP7182P 


[primer sequence illegible in source]


GCGT CTCGAG GCTACTAAATCGAATCGA 


CP6262P 


GTGCGT CATATG [remainder illegible in source]


ACTCGCTA GCGGCCGC TTCACTGGGAGCTTGA 


CP6269P 


[primer sequence illegible in source]


ACTCGCTA GCGGCCGC GATTTTCTTCTTCAGCTC 


CP6296P 


[primer sequence illegible in source]


ACTCGCTA GCGGCCGC ATGTTTCTTTTTACTCTTTCT 


CP6419P 


[primer sequence illegible in source]


GCGT CTCGAG AAGTGTTCGTTGGAAGT 


CP6601P




GCGT CTCGAG GAAAATCTGAATTCTTCCT 


CP6639P 


[primer sequence illegible in source]


GCGT CTCGAG AGGAACTAAAACCTCATCT 


CP6664P 


GTGCGT GCTAGC [remainder illegible in source]


ACTCGCTA GCGGCCGC CTTAGAAAGACTATTTTCTAAGTA 


CP6696P 


GTGCGT CATATG [remainder illegible in source]


GCGT CTCGAG ATTCATCTTCGTAAAGAAT 


CP6757P 




ACTCGCTA GCGGCCGC CTGTCCCTCTGGAGC 


CP6790P 


GTGCGT GCTAGC AGTGAACACAAAAAATCA 


ACTCGCTA GCGGCCGC CTTATCGTCGTTATCAATA 


CP6814P 




GCGT CTCGAG TACAGCTGCGCGA 


CP6834P 




GCGT CTCGAG TACATTTGTATTGATTTCAG 


CP6878P 


GTGCGT CATATG AACGTCCCTGATTCC 


GCGT CTCGAG GCTAGCGGCTCTTTC 


CP6892P


GTGCGT CATATG CAGAAGCATCCTTCCT 


ACTCGCTA GCGGCCGC TCCTCTTTAGGAAATGG 


CP6909P 


GTGCGT CATATG TCCTCTTTAGGAAATGG 


GCGT CTCGAG CAGTGCCAAGTAGGGA 


CP7015P 


GTGCGT CATATG GCAGTACGATTAATTGTTG 


GCGT CTCGAG TTTATTGTAGTCTATTTTATATTTC 


CP7035P 


GTGCGT GCTAGC AGCAGAAAAGACAATGA 


GCGT CTCGAG ATTTTGAGTGTCTTGCA 


CP7073P 


GTGCGT CATATG ATTACCATAAATCACGTG 


GCGT CTCGAG TATCCATCGACTTATAGC 


CP7085P 


GTGCGT GCTAGC TGTATTTTCCCTTACGTA 


ACTCGCTA GCGGCCGC GGATTCTGCATACTCTG 


CP7092P 


GTGCGT CATATG TCTCCTCTTCCTAAAAAA 


GCGT CTCGAG GGATTCATTACTGACCA 


CP7093P 


GTGCGT CATATG AAATACCGCTTCACG 


GCGT CTCGAG ATTCTGTAGGGCTACGT 


CP7094P 


GTGCGT CATATG GTACACTTCTCTCATAACCC 


GCGT CTCGAG TAAGTTTGTATTGCGGTAT 


CP7132P 


GTGCGT CATATG TTGTTATTAGGGACTTTAGGA 


GCGT CTCGAG TTTCCCAACCGCA 


CP7133P 


GTGCGT CATATG GCTGCGAATGCTC 


GCGT CTCGAG TAATTTAATACTCTTTGAAGG 


CP7177P 


GTGCGT CATATG CCTACTCAAGTTAAAACAGA 


GCGT CTCGAG AAGTTTATATTTCAGCACTT 


CP7184P 


GTGCGT GCTAGC CATATAGGATTTTGCCA 


GCGT CTCGAG GTACTTAGCAAAGCGAT 


CP7206P 


GTGCGT GCTAGC AAGAAGCTATATCACCCTA 


GCGT CTCGAG CACACCGAGGAAAC 


CP7222P 


GTGCGT CATATG GTAGTTTCAGAAGAAAAAGTC 


GCGT CTCGAG ACGTATGCGCAACTG 


CP7223P 


GTGCGT CATATG GAAGTATTAGACCGCTCT 


GCGT CTCGAG CGAGAAAAAGCTTCC 


CP7224P 


GTGCGT CATATG ATGAAGAAAATTCGAAA 


ACTCGCTA GCGGCCGC TAAGCATTCACAAATGA 


CP7225P 


GTGCGT CATATG CATATTTTGCTTGATCGT 


GCGT CTCGAG TCTTTTAACTAAATCTTGTTCTT 


CP7303P 


GTGCGT CATATG CTTGTCTATTGTTTTGATCC 


GCGT CTCGAG AAAATATACGGAACTCGC


CP7304P 


GTGCGT GCTAGC GAAGTTTATAGTTTTTCCC 


GCGT CTCGAG TTTTTGATTCCTTAAGAAG


CP7305P 


GTGCGT CATATG GAAGTTTATAGTTTTCACCCT 


GCGT CTCGAG ACTCCTTGAGAAGGGAA


CP7307P 


GTGCGT CATATG CTTAATCATGCTAAAAAGC 


ACTCGCTA GCGGCCGC CTCTTTTATTTTAGGAAGCT


CP7342P


GTGCGT CATATG AAAAAAAAATTTATTTTCTACT 


ACTCGCTA GCGGCCGC CACACTCTGTTCTTCTG 


CP7347P 
CP7353P 
CP7193P 


GTGCGT CATATG TTTTCTAAGGATTTGACTAA 
GTGCGT CATATG AATATGCCTGTTCCTTCT 
GTGCGT CATATG TGTTCCCTGGATCCT 


GCGT CTCGAG CGAAGCAGAAGTCGT 
GCGT CTCGAG GGGGCGTAGGTTGTA 


CP7248P 
CP7261P 
CP7280P 
CP7302P 


GTGCGT GCTAGC CTTGAACATTCTAAACAAGAT
GTGCGT CATATG TGTCTATCTGCCTACATAG 
GTGCGT CATATG GACCAGAAAATTGAAAA 
GTGCGT CATATG AATTTCCATTGTAGTGTAGT 


ACTCGCTA GCGGCCGC AGTTATCACTATATCCACAAG 
GCGT CTCGAG ACGTAGTTTAAGAGCAGACT 
GCGT CTCGAG TTTTGATGCTTCTTTCA 
GCGT CTCGAG AGAGGTCTTCTGAGTGC 
GCGT CTCGAG GAACAGTTCGATTTGTG 


CP7306P 
CP7367P 
CP7408P 
CP7409P 


GTGCGT CATATG CTTCCTTTATCAGGGCA 
GTGCGT GCTAGC CGTTATGCCGAGGTC 
GTGCGT CATATG TTGAAAATCCAGAAAAA
GTGCGT CATATG AGACGTTATCTTTTCATGGT 


ACTCGCTA GCGGCCGC TTCTTCAGGTTTCAGG 
GCGT CTCGAG TTCGTGCATTTGGTG 
GCGT CTCGAG ATTCATTTTCGGAAGAG 
GCGT CTCGAG CCCTTTGCTCTTTACATAG 


CP6733P 
CP6728P 


GTGCGT ACTAGT TGTCACCTACAGTCACTAG 
GTGCGT ACTAGT AAGTCCTCTGTCTCTTGG 


GCGT CTCGAG GAATCGGAGTTTGGTA 
GCGT CTCGAG GAAACAAAACTTAGAGCCC 



TABLE III - Proteins with best results in FACS analysis

cp number   Molecular weight (kDa)               Fusion type
            Theoretical     Western blot
6260        97.5            94; 70               GST
6270        87.5                                 GST
6272        78.0            90                   GST
6273        58.6            74; 64; 50           GST
6296        31.1                                 GST
6390        88.9            102                  GST
6456        42.5            89; 67; 45           GST
6466        57.5            59; 56               His
6467        59.0            67                   GST
6552        28.4            50; 27               GST
6576        86.0            79; 70; 62; 45       GST
6577        17.3            12                   GST
6602        43.4            53; 42; 34           GST
6664        54.5            104; 45              GST
6696        47.9            95; 53               GST
6727        130.0-142.9     123; 61; 39          His
6729        94.8            multiple bands       GST
6731        95.5            97                   GST
6733        97.1            104                  His
6736        100.1           98; 93; 66; 60       GST
6737        101.2           multiple bands       GST
6751        100.2           95; 71               GST
6752        102.1           97; 48               His
6767        29.1            28                   GST
6784        32.9            35                   GST
6790        71.3            multiple bands       His
6802        29.7                                 GST
6814        29.6            28                   GST
6830        177.4           174; 91; 13          GST
6849        57.3            multiple bands       GST
6850        7.4-9.4         61; 14; 8            GST
6854        42.2            -                    GST
6878        40.4            -                    GST
6900        28.0            -                    GST
6960        25.6            75; 35               GST
6968        34.6            83; 53; 35           GST
6998        39.3            multiple bands       GST
7033        68.2            multiple bands       GST
7101        113             105                  GST
7102        63.4            -                    GST
7105        29.2            30                   GST
7106        39.5            72; 46               GST
7107        71.4            67; 31               His
7108        35.9            35                   GST
7111        46.1            51                   GST
7132        17.9            57; 47; 17           His
7140        36.2-29.8       50; 38; 34           GST
7170        34.4            77; 33               GST
7224        39.4            40                   GST
7287        167.3           180                  GST
7306        50.1            50                   GST



TABLE IV - FACS-positive proteins not found in C. trachomatis

cp7105    cp6390    cp7106
cp6784    cp7107    cp6296
cp7108





TABLE V - Proteins identified by MALDI-TOF following 2D electrophoresis

cp6270    cp6733    cp6900
cp6552    cp6736    cp6960
cp6576    cp6737    cp6998
cp6577    cp6752    cp7033
cp6602    cp6767    cp7108
cp6664    cp6784    cp7111
cp6727    cp6790    cp7170
cp6728    cp6830    cp7287
cp6729    cp6849    cp7306

CLAIMS

1. A protein comprising an amino acid sequence selected from the group consisting of SEQ IDs 97,
1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53,
55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 99, 101, 103, 105,
107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143,
145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181,
183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219,
221, 223, 225, 227, 229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251, 253, 255, 257,
259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, 289, 291, 293, 295,
297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323, 325, 327, 329, 331, 333,
335, 337, 339, 341, 343, 345, 347, 349, 351, 353, 355, 357, 359, 361, 363, 365, 367, 369, 371,
373, 375, & 377.

2. A protein having 50% or greater sequence identity to a protein according to claim 1.

3. A protein comprising a fragment of an amino acid sequence selected from the group consisting of
SEQ IDs 97, 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47,
49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 99,
101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137,
139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175,
177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, 211, 213,
215, 217, 219, 221, 223, 225, 227, 229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251,
253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, 289,
291, 293, 295, 297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323, 325, 327,
329, 331, 333, 335, 337, 339, 341, 343, 345, 347, 349, 351, 353, 355, 357, 359, 361, 363, 365,
367, 369, 371, 373, 375, & 377.

4. A nucleic acid molecule which encodes a protein according to any one of claims 1 to 3. 

5. A nucleic acid molecule according to claim 4, comprising a nucleotide sequence selected from
the group consisting of SEQ IDs 98, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34,
36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86,
88, 90, 92, 94, 96, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128,
130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166,
168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204,
206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242,
244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280,
282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318,
320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356,
358, 360, 362, 364, 366, 368, 370, 372, 374, 376, & 378.

6. A nucleic acid molecule comprising a fragment of a nucleotide sequence selected from the group
consisting of SEQ IDs 98, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40,
42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92,
94, 96, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134,
136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172,
174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210,
212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248,
250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286,
288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324,
326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362,
364, 366, 368, 370, 372, 374, 376, & 378.

7. A nucleic acid molecule comprising a nucleotide sequence complementary to a nucleic acid 
molecule according to any one of claims 4 to 6. 

8. A nucleic acid molecule comprising a nucleotide sequence having 50% or greater sequence
identity to a nucleic acid molecule according to any one of claims 4 to 7.

9. A nucleic acid molecule which can hybridise to a nucleic acid molecule according to any one of 
claims 4 to 8 under high stringency conditions. 

10. A composition comprising a protein or a nucleic acid molecule according to any preceding claim. 

11. A composition according to claim 10 being a vaccine composition.

12. A composition according to claim 10 or claim 11 for use as a pharmaceutical.

13. The use of a composition according to claim 10 in the manufacture of a medicament for the 
treatment or prevention of infection due to Chlamydia bacteria, particularly Chlamydia 
pneumoniae.

FIGURE 1

Fig. 1A
Fig. 1B
Fig. 1C

Negative control
Anti-6552 mouse sera

FIGURE 2

Fig. 2A
Fig. 2B
Fig. 2C

FIGURE 3

Fig. 3A
Fig. 3B

FIGURE 4

Fig. 4A